The Ninety-Fourth Issue of the Psychological Barometer
and a Note on Its Fifteenth Anniversary 


Henry C. Link 
The Psychological Corporation, New York City 


By the time this is published, the Psychological Barometer will have 
passed its fifteenth anniversary. It was in March, 1932, that fifteen 
psychologists, cooperating with The Psychological Corporation, made a 
survey by means of 1578 personal interviews in as many homes in fifteen 
cities from coast to coast. The psychologists volunteered their time and 
that of their students in making this survey. The report of this survey 
in the Harvard Business Review of January, 1933 (4), gives complete de- 
tails as to methods and results. Additional psychologists cooperated in 
several enlarged surveys which helped to establish the Psychological Ba- 
rometer on a firm, self-supporting basis. The second report was made 
in the Journal of Applied Psychology in February, 1934 (5). 

From the outset, the Barometer has been used as a measure of public
opinion, of public attitudes, of ability to identify certain advertising
themes, of public behavior, especially buying, reading and radio listening
habits. The first survey reported the development of the triple associates
test for measuring an important aspect of advertising effectiveness (4).
This test, illustrated by such a question as this: What coffee advertises:
“Look for the Date on the Can”? is now one of the chief supports of the
Barometer. It has also become a standard test in the fields of advertising,
politics and propaganda generally.

Probably the earliest continuous public opinion poll made entirely 
with personal interviews is recorded in the following results from six 
Psychological Barometer surveys (11), shown at top of page 106. 

Question: From what you have seen of the National Recovery Act in your
neighborhood, do you believe it is working well?

                    Oct.    Nov.    Jan.    Apr.    Sept.   Jan.
Answers             1933    1933    1934    1934    1934    1935
                     %       %       %       %       %       %

Yes                  48      41      55      50      38      38
No                   27      30      22      23      26      33
Uncertain            25      29      23      27      36      29
Total Interviews   1932    2386    3076    5167    4000    3710

Since November, 1937, in two ten thousand interview Barometer
surveys each year, the favorable or unfavorable attitudes toward eight
of the country’s leading companies are ascertained. These eight companies,
who also underwrite this service, make extensive use of it in planning
and in measuring the effects of their public relations activities.
This is the oldest continuous series of public attitude or public relations
surveys in the field. In 1944 another semi-annual attitude Index for
another group of companies was begun.

The use of these Barometers in measuring people’s brand buying 
habits, described elsewhere, continues. Some day, as a result of these 
attitude and behavior measures, we may be able to make interesting 
contributions to the problem of the relationship between attitudes and 
behavior. The fact that these surveys are made four times a year with 
10,000 interviews and twice a year with 5000 interviews provides an unu- 
sually broad base for research of this kind. 

Other aspects of these Barometer surveys are described in the pub- 
lications listed at the end of this report. 

Probably the chief limitation of the Psychological Barometer is that 
it has always been based on an urban sample. This has made it impos- 
sible to use these surveys for predicting elections, the feature by which 
the Gallup and Fortune surveys have become best known. 

A source of great regret to the writer is the fact that surveys like the 
Barometer, the Gallup, Crossley, Roper and other polls are not more 
widely recognized as the peculiar instruments of psychological research. 
Every so often people who learn that I am a psychologist ask me what a 
psychologist does. When I tell them that one of my chief activities is the 
conduct of opinion and behavior surveys, they often exclaim: “But what 
has that got to do with psychology?” Whereupon I may answer that 
psychology is often defined as the scientific study of the mind, and opinion 
polls are scientific attempts to measure the public mind. Or I may tell 
them that psychology is defined as the study of behavior or habits, and 
surveys are scientific attempts to measure the behavior and habits of 
representative samples of people. 

It seems a pity that so many people, including college graduates, have 
so little understanding of psychology and its major fields. Public opinion 
polls, for instance, are identified with Dr. Gallup far more frequently 
than with psychology, even though Gallup is a psychologist. They are 
identified with Fortune and other magazines, with the fields of journalism,
politics, political science, but hardly at all with psychology. And yet, 
historically, methodologically and in almost every other way, public opin- 
ion polls and behavior surveys are peculiarly the field of individual and 
social psychology. 

Psychologists themselves, rather than the public or other disciplines, 
are responsible for this situation. What has happened here is much like 
that which has happened in the fields of motion study and personal coun- 
seling or clinical psychology, to mention only two. Motion study in 
this country was taken over by the engineers and efficiency experts. 
Clinical psychology has been increasingly taken over by the psychiatrists. 
The very name, clinical psychology, implies the field of crisis therapy, the 
medical field. Instead of distinguishing its own unique approach, espe- 
cially in the great field of near-normal behavior problems, the very use 
of such terms as clinical, therapy, and others has helped to push what 
should be psychological problems into the field of medicine. A few psy- 
chologists, including Cyril Burt in England, have recently begun to pro- 
test against this trend now so powerfully established. 

It is sometimes said: What difference does it make whether a given 
field falls into one discipline or another, so long as it is being adequately 
handled? The question is: Are these fields being adequately handled by 
the other disciplines? Or even, can they be? So far as vocational, edu- 
cational and certain emotional problems are concerned, many psycholo- 
gists would answer with a positive no. They would agree that the pre- 
dominantly abnormal or psychiatric approach to many of these problems 
can not take the place of the approach of a psychology predominantly 
interested in the more normal. 

In recent years psychologists have made considerable progress in
catching up with the procession in the field of public opinion and behavior
surveys. To be sure, the prize plum of recent years is the study of
sex being made by the zoologist, Alfred C. Kinsey, under a substantial
grant from the Rockefeller Foundation (3). Since its beginning about
eight years ago, personal interviews have been made with over 12,000
people and 100,000 interviews are planned over a period of years. The
techniques used are those which psychologists, among all the professions,
have probably done most to develop: the interviewing technique, the
clinical case study, the questionnaire, the standardized inventory, the
statistical treatment of such data, and sampling methods.

The name, Psychological Barometer, was chosen in the first place be- 
cause of our conviction that the techniques involved stemmed directly 
from the basic concepts and methods of psychology, both personal and 
social. The Gallup and the Roper or Fortune polls are psychological 
barometers just as much as is our poll. However, even though Gallup 
is a psychologist and Roper has made good use of psychologists, these
polls are not generally regarded as integral aspects of psychology. Gal- 
lup’s contributions to this field are known to the initiate, and his book, 
The Pulse of Democracy (Gallup, George and Rae, Saul Forbes. New 
York: Simon & Schuster, 1940), in the writer’s opinion, marks the be- 
ginning of a new era in Social Psychology. And yet, it was only after he 
had achieved wide public recognition that some psychologists began to 
take him seriously. 

When we first began the Psychological Barometer, some psychologists 
even questioned its right to the title, psychological. This was partly 
because our surveys dealt with such commercial matters as buying habits 
and advertising. Once more, the traditional distinction between pure 
research and applied research worked to the disadvantage of psycholo- 
gists. It hindered their grasping quickly the great possibilities of such 
surveys not only for pure research but for applied research in the wide 
ranges of social and personal psychology. 

During the past decade there has been a great increase in the number 
of psychologists interested in survey techniques. However, most of 
this interest has been concentrated in the field of public opinion and po- 
litical polls. Many psychologists still fail to see that, from the stand- 
point of progress in techniques, advertising and market surveys have far 
more to offer than do political opinion surveys. To measure people’s 
buying behavior requires far more exacting methods than to measure 
their voting behavior. The voter selects from only a few candidates or 
maybe only between two or three political parties. The buyer selects 
from hundreds of brands of coffee, soap, soups, and other product classi- 
fications. The voter chooses once a year or less often. The buyer 
chooses every day. Only qualified citizens of age may vote, but every 
individual may buy. 

In political terms, the scope of economic democracy, or democracy of 
the market place, far exceeds that of political democracy. In scientific 
terms, market research offers far greater possibilities than does political 
research. Political research can validate its findings only once a year,
market research every day in the year. Political research has one behavior
standard for validation; market research has all kinds of established
objective standards. The public opinion researcher may spin theories almost
at will about survey results. The market researcher whose survey findings
do not jibe with U. S. Census Bureau facts or the appropriate data from
thousands of other objective standards would soon be exposed.

Therefore, the combination of market and public opinion research
represented by the Psychological Barometer offers distinct advantages
for research. It has been gratifying to watch the growing interest of the
research associates and other psychologists in the possibilities of these
Barometer surveys in their courses in social and applied psychology, their
new textbooks and their own research projects.


The 94th Psychological Barometer 


The current survey was made with 5000 interviews during October, 
1947, by 380 interviewers under the supervision of 144 psychologists in 
147 cities and towns. It represents a true cross-section of the urban 
population. Two questionnaires were used, one with one-half the sample, 
or 2500 people, the other questionnaire with the other half of the 5000 
people. These two sub-samples were comparable by geographic, sex, 
socio-economic, and other criteria. Each question in this study was asked
of one or the other of these two 2500-person sub-samples.
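The practical meaning of splitting 5000 interviews into two halves of 2500 can be sketched with the usual standard error of a proportion. This is only an illustration under a loudly stated assumption: it treats each half as a simple random sample, which the modified area sample described in this report is not, so the figures are at best a rough lower bound. The function name and the 1.96 multiplier are choices made for this sketch, not part of the original report.

```python
import math

def proportion_margin(p, n, z=1.96):
    """Approximate margin of error for a proportion p with n interviews.

    Assumes simple random sampling (an assumption of this sketch only);
    z=1.96 corresponds to a conventional 95 per cent interval.
    """
    return z * math.sqrt(p * (1.0 - p) / n)

# Sample sizes taken from the text: one half-sample and the full sample.
for n in (2500, 5000):
    margin = proportion_margin(0.5, n)
    print(n, "interviews: about", round(100 * margin, 1), "percentage points")
```

Under that assumption a percentage based on one questionnaire is somewhat less stable than one based on all 5000 interviews, which is the price paid for asking twice as many questions.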

The October survey included several questions on the lighter aspects 
of American life as well as a series on the political issues of the day. That 
is, the questions dealt not only with such topics as ownership of pets, 
musical instruments, and the extent and means of travel but also with 
people’s opinions on the present labor laws, the cost of living and the 
prospects of peace. 

Sampling Method. A modified area sampling method was used. All
interviews were assigned by the local supervising psychologist by blocks
and streets in accordance with maps constructed to designate the proper
socio-economic levels. These maps are made to divide the population
into four principal groups: the “A” group consisting primarily of owners
and executives; the “B” group, primarily white-collar and semi-professional;
the “C” group or skilled factory and transportation workers; the “D”
group, or the less skilled. About 28 per cent of the sample are union
members. All interviews were made in the home, but only one in a
family; half were made with women, half with men.


The Ownership of Pets 


In the October Psychological Barometer, a question asked in October 
1938 was repeated in order to measure trends in the ownership of pets. 
The results for the two studies are given below. 


Q. “Do you have any pets in your home? What are they?” 








                                   Oct.
Answers                            1938

Dog                                27%
Cat                                13
Canary, parrot, or other bird       8
Fish                                2
Others                              2
Total Interviews

Thus, while the percentage owning dogs and cats has not changed at
all, canaries, parrots and other birds are owned by fewer urban families
today than in 1938. The detailed results show that pets are owned just 
as frequently by the poor or middle class families as by the more affluent. 


The Playing of Musical Instruments 


Q. “Does anybody in your family play a musical instrument, and if so, who plays 
and what?” 








Socio-Economic Groups 
Answers A B C D 








Total families in which musi- 
cal instruments are played 38% 31% 


1 member of family plays 23 
2 members of family play 7 
3 members of family play 1 
4 members of family play 2 


Total families in which no 
members play 


Total Interviews 





* Less than .5%. 


Kinds of Instruments Played

Instruments

Piano
Violin
Clarinet
Guitar
Trumpet
Saxophone
Drum
Accordion
Harmonica
Cello
Miscellaneous

* Less than .5%.


Note: The total per cents in the above table add to more than 100 because some 
people play more than one instrument. 


A Question of Chivalry


Q. “Do you think a man should stand up and give a woman his seat in a bus, train 
or street car?” 








                           Socio-Economic Groups         Sex
Answers             Total    A      B      C      D      Men    Women

Yes, give up seat    42%    36%    37%    42%    51%    42%    41%
No                   13     13     13     13     10     12     14
Depends              45     51     50     45     39     46     45





While men and women agree quite closely on this subject, people of
higher education and wealth are less likely to believe in this form of
chivalry than are those of less education and wealth.


Extent and Means of Travel 


Q. “How many trips of 500 miles or more one way did you make since last October?”








Socio-Economic Groups 


Answers A B Cc D 








None 51% 65% 75% 


1 trip 25 2 18 
2 trips 11 
3 trips A 
4 trips 3 
5 trips 1 
Over 5 trips 4 
Don’t know 1 





* Less than .5%. 


Q. “How did you go, by train, air, bus, or auto?” 








               Socio-Economic Groups
Answers             A        D

Auto               46%      41%
Train              32       33
Air                18        5
Bus                 2        9
Ship                *        2
Don’t know          2       10

* Less than .5%.


Q. “If the cost were the same, how would you go next time, by train, air, bus, or
auto?” 








Socio-Economic Groups 





Answers A B Cc D 





Auto 42% 47% 419% 
Train 18 


17 

Air 31 27 
2 

1 

4 


Bus 1 
Ship _ 
Don’t know 3 





* Less than .5%. 


According to these results, more than twice as many people would
travel by air if the cost were the same than have travelled by air since
October 1946. Trains would suffer the greatest loss of passengers if the
costs were the same.


Whom Do the Labor Laws Favor 


Q. “Do you think that the present laws regulating labor unions favor business, 
favor labor unions, or are fair to both?” 








                         Socio-Economic Groups
Answers                  A      B      C      D

Are fair to both        44%    42%    41%    39%
Favor labor unions      31     25     22     16
Favor business          12     18     20     19
Don’t know              13     15     17     26











                          Union Membership
                         Union      Non-
Answers                  Members    Union

Are fair to both          40%        42%
Favor labor unions                   25
Favor business            30         14
Don’t know                15         19


Labor Union Monopolies vs. Company Monopolies


Q. “Which monopolies are more dangerous: monopolies by big companies or
monopolies by big labor unions?”








                                  Socio-Economic Groups
Answers                   Total    A      B      C      D

Labor union monopolies     31%    43%    33%    32%    22%
Company monopolies         16      9     16     16     18
Equally dangerous          42     44     45     41     39
Don’t know                 11      4      6     11     21

                               Sex          Union Membership
                                            Union     Non-
Answers                    Men    Women     Members   Union

Labor union monopolies     29%    34%        21%      35%
Company monopolies         16     16         24       13
Equally dangerous          48     36         43       42
Don’t know                  7     14         12       10





The Cost of Living 


In view of the current concern over the high cost of living, we included 
several questions on what people thought were the reasons for the high 
prices, how they thought prices could be reduced, and what they thought 
of a return to rationing. 

Q. “What do you think are the main reasons for the high prices of food and other 
products?” 

A classification of the responses to the above question revealed that 
more people blame the Government or labor unions for high prices today 
than blame any other agencies. 

















                              Socio-Economic Groups     Union Membership
                                                         Union    Non-
Answers               Total    A      B      C      D    Members  Union

The Government         29%    31%    29%    29%    28%    31%     28%
Unions and strikes     28     32     30     28     22     24      29
Business companies     16     10     15     17     19     19      14
Scarcities             10     14     11      9      6      8      10
Inflation               9     12     11      8      6      7      10
Farmers                 3      4      3      3      4      3       3
Miscellaneous          24     27     25     23     21     25      23
Don’t know             12      8     10     12     17     11      12

Those people who named the Government as primarily responsible
gave as the principal reasons: Government shipping goods overseas, 
Government subsidies keeping prices up, Government removal of price 
controls, Government politics, high taxes. Those who named labor gave 
as more specific reasons: the high wages unions were asking, too many 
strikes. Those who named manufacturers and business concerns as the 
cause said that companies were making too much money, were holding 
back their products for higher prices, were charging too much for their 
services as middle man. Among the miscellaneous reasons given were: 
aftermath of the war, not enough consumer resistance, increased cost of 
production, speculation. 


Q. “Because of the high prices and shortage of food, do you think we should go 
back to food rationing?” 








                         Socio-Economic Groups     Union Membership
                                                    Union    Non-
Answers          Total    A      B      C      D    Members  Union

No                65%    71%    64%    63%    65%    62%     66%
Yes               30     26     31     31     27     34      28
Don’t know         5      3      5      6      8      4       6





Q. “Do you think prices can best be reduced by more government regulation or 
by fewer government regulations?” 








Socio-Economic Groups Union Membership
Union Non-
Answers Total A B C D Members Union

By more gov’t regulations 44% 36% 43% 46% 48% 42%
By fewer gov’t regulations 42 56 46 35 38 45
Don’t know 14 8 11 19 14 13





Changes in Family Prosperity 


Included in the October Psychological Barometer was a question we 
have been asking consistently since October 1941 in order to measure 
trends in people’s beliefs about their living standards. 


Q. “Is your family more prosperous or better off today than two years ago, less 
prosperous, or the same?” 

The following table gives the results for the two earliest studies and
for the last two:

                      Oct.    Oct.    April   Oct.
Answers               1941    1942    1947    1947

More prosperous       38%     29%     29%     24%
The same              47      47      42      46
Less prosperous       15      21      26      28
Uncertain             --       3       3       2





The group which seems, to a slight degree, to be best off, according
to its own answers, is the “B” or white-collar and semi-professional group.
The opinions of union members are very much like those of non-union 
members. 

















                       Socio-Economic Groups     Union Membership
                                                  Union    Non-
Answers               A      B      C      D      Members  Union

More prosperous      23%    27%    24%    20%      23%     25%
The same             48     46     44     49       44      46
Less prosperous      26     25     29     28       29      27
Uncertain             3      2      3      3        4       2





Beliefs About Socialism in England 
Q. “Did you know that England has a labor socialist government now?” 








                       Socio-Economic Groups
Answers       Total    A      B      C      D

Yes            66%    84%    80%    63%    42%
No             24     10     15     26     39
Don’t know     10      6      5     11     19





Q. “Do you think that socialism in England will succeed or fail?” 








                       Socio-Economic Groups
Answers       Total    A      B      C      D

Fail           47%    57%    56%    48%    29%
Succeed        12     19     13     10     11
Don’t know     41     24     31     42     60


The Diminishing Prospects for Peace
Since 1943, we have asked in seven different surveys the question: 


Q. “After this war (or, now that the war is over) do you think that we will make a 
peace settlement that will last or do you think that we will have another world war in 
twenty-five years or so?” 

The responses to this question indicate that more people expect war 
today than ever in the last five years. 








                          Feb.    Oct.    Oct.            Oct.
Answers                   1943    1944    1945            1947

Will have another war     43%     54%     59%             77%
Will make a lasting
  peace                   47      28      28      18      11
Don’t know                10      18      13       8      12








The expectation that Russia will be the next enemy has been constantly
increasing.
Q. “Who do you think will be our next enemy?” 








                          Oct.    Oct.
Country Named             1944    1945

Russia                    29%     37%
Germany                    9       2
Japan                      5       5
England                    4       3
China                      1       1
Don’t know                 6      11

Total % Expecting War     54      59








More than half of those who said in October that they expected war 
said they expected it within 10 years. 
Q. “In how many years do you think it will be?” 








Answers                       Oct. 1947

1 year or less                    4%
2-4 years                        10
5-10 years                       37
11-19 years                      12
20-25 years                       8
Over 25 years or uncertain        6

Total % Expecting War            77
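
The statement that more than half of those expecting war expected it within 10 years can be checked from the table itself. The short sketch below simply re-adds the published percentages, which are percentages of the whole sample; the variable names are made up for the illustration.

```python
# Percentages of the whole sample, copied from the October 1947 table.
years_to_war = {
    "1 year or less": 4,
    "2-4 years": 10,
    "5-10 years": 37,
    "11-19 years": 12,
    "20-25 years": 8,
    "Over 25 years or uncertain": 6,
}

# The rows should add to the total per cent expecting war.
total_expecting_war = sum(years_to_war.values())

# Those naming 10 years or less, as a share of all who expect war.
within_ten = sum(
    years_to_war[key] for key in ("1 year or less", "2-4 years", "5-10 years")
)
share_within_ten = within_ten / total_expecting_war

print(total_expecting_war, within_ten, round(share_within_ten, 2))
```

On these figures the rows do add to the reported total, and those naming 10 years or less make up roughly two-thirds of all who expect war.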





Received January 30, 1948. 
Early publication. 


Studies of Job Evaluation. 7. A Factor Analysis of
Two Point Rating Methods of Job Evaluation 


C. H. Lawshe, Jr., Edmund E. Dudek, and R. F. Wilson 
Division of Applied Psychology, Purdue University


As job evaluation techniques become more widely and frequently 
used, more and more questions concerning the applicability and effective- 
ness of these systems arise. Some of these questions relate to the types 
of evaluation systems available, the jobs to which these systems are ap- 
plicable, the number of scale items needed for effective evaluation, the 
reliability of scales of different lengths, and the number of separate and 
distinct factors actually involved in these scales. In previous studies 
(5, 6, 7, 8, 9, 10), information has been presented bearing on several of 
these problems. 

In this study an attempt is made to obtain some information concern- 
ing the basic factors involved in two different point-rating scales, viz., 
the NEMA Job Evaluation System (4) and a Simplified Job Evaluation 
System devised by the senior author (10). Specific questions were:
What are the separate and distinct factors which are operating in these
two systems? Which factors do the systems have in common and which, 
if any, are specific to one or the other system? And, how great a dis- 
crepancy in factor loadings or weights is there between the two systems 
in the factors which they have in common? Answers to these questions 
will help to indicate to what extent the same factors or elements are 
evaluated by the two methods. 


Procedure 


The Job Evaluation Systems. A factor analysis was made of the
intercorrelations between ratings of forty jobs made by twenty analysts
using two job evaluation methods. The NEMA System, as adopted by 
the National Electrical Manufacturers Association, provides for the rating 
of jobs on eleven items in four categories: namely, Skill (Education, Ex- 
perience, and Initiative and Ingenuity), Effort (Physical Demand and 
Mental and Visual Demand), Responsibility (Equipment or Process, 
Material or Product, Safety of Others, Work of Others), and Job Condi- 
tions (Working Conditions, and Unavoidable Hazards). Each item is 
rated on a weighted five-point scale. The Simplified System provides
for ratings on four items: Learning Period, General Schooling, Working
Conditions, and Job Hazards. The items provide for rating in five, six,
or seven defined degrees. A more complete description has been
previously given by Lawshe and Wilson (10).


Table 1
Occupations Rated

Job      USES D.O.T.                                                  NEMA
Number   Code          USES Job Title                                 Labor Grade

         1-23.14       Messenger I
         1-38.91       Stockroom Man
         1-38.05       Tool Clerk
         2-61.03       Watchman
         2-82.10       Charwoman
         2-84.10       Janitor I
         2-86.20       Sweeper
         2-95.30       Elevator Operator, Freight
         3-40.04       Grounds Keeper I
         4-75.010      Machinist Maintenance
         4-75.010      Machinist (All Around)
         4-76.210      Tool and Die Maker
         4-78.011      Engine Lathe Operator
         4-78.042      Horizontal Boring and Milling Machine Opr.
         4-78.061      Shaper Operator
         4-78.071      Planer Operator
         4-78.503      Surface-Grinder Operator
         4-80.010      Sheet Metal Worker II
         4-83.100      Boilermaker Maintenance
         4-85.020      Welder, Arc
         4-85.030      Acetylene Welder
         4-86.010      Blacksmith II
         4-87.010      Heat-Treater
         4-97.420      Electrical Repairman
         5-25.830      Carpenter, Maintenance
         5-27.010      Painter, Maintenance
         5-29.100      Mason-Plasterer
         5-30.210      Plumber, Maintenance
         5-73.010      Electric-Bridge-Crane Operator
         5-78.100      Millwright
         5-83.611      Maintenance Man, Building
         5-84.110      Tool Grinder Operator
         5-88.020      Rigger III
         6-77.020      Buffer
         6-78.611      Band Saw Operator
         6-78.632      Assembler
         7-70.010      Fireman, Low Pressure
         9-65.14       Boiler Cleaner
         9-71.01       Oiler I
         9-55.02       Automobile Washer

The Job Descriptions Rated. Ratings were made of forty jobs from 
job descriptions which had been adapted from the USES National Job 
Description Series (2, 3). Twenty-four of these jobs were classified by 
the Dictionary of Occupational Titles as skilled, four were semiskilled, three 
unskilled, and the remaining nine as clerical, service, and agricultural. 
A list of these jobs, their corresponding USES code numbers, and NEMA 
labor grades is presented in Table 1. 

The Ratings. Twenty analysts, most of them personnel department 
supervisors, job evaluation supervisors, or job analysts, participated in 
this study. As discussed in a previous article (10) each analyst rated only 
twenty of the forty jobs. There were actually ten independent ratings 
of each job; five by the NEMA System, and five by the Simplified System. 
For each job the item ratings made by the five analysts were averaged; 
thus giving a composite rating on each item and the total for each job. 
The reliabilities of the several items in the NEMA System, determined by
intercorrelating the ratings of five analysts, were previously reported and
discussed by Lawshe and Wilson (10) as ranging from .72 to .96, and the
reliability for “Total Points” was .94. The reliabilities of the items in the
Simplified System, obtained in the same way, ranged from .84 to .97, and
“Total Points” reliability was .98. These reliabilities are shown in the
last column of Table 2.
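
The averaging step described in this paragraph can be sketched as follows. The five sets of ratings are hypothetical values invented for the illustration (the study's raw ratings are not reproduced here), and the function name is likewise an assumption of the sketch; only the procedure, averaging each item over five analysts and totaling the averages, comes from the text.

```python
from statistics import mean

def composite_ratings(ratings_by_analyst):
    """Average each item over the analysts, then total the item averages."""
    items = list(ratings_by_analyst[0].keys())
    composite = {
        item: mean(rating[item] for rating in ratings_by_analyst)
        for item in items
    }
    # The total is computed from the item averages before it is stored.
    composite["Total Points"] = sum(composite.values())
    return composite

# Hypothetical ratings of one job by five analysts on the four Simplified items.
ratings = [
    {"Learning Period": 5, "General Schooling": 3, "Working Conditions": 2, "Job Hazards": 1},
    {"Learning Period": 6, "General Schooling": 3, "Working Conditions": 2, "Job Hazards": 2},
    {"Learning Period": 5, "General Schooling": 4, "Working Conditions": 3, "Job Hazards": 1},
    {"Learning Period": 5, "General Schooling": 3, "Working Conditions": 2, "Job Hazards": 1},
    {"Learning Period": 4, "General Schooling": 2, "Working Conditions": 1, "Job Hazards": 1},
]

job_composite = composite_ratings(ratings)
print(job_composite)
```

Because averaging is linear, totaling the item averages gives the same figure as averaging the five analysts' totals.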

Intercorrelations and Factor Analysis. The average item ratings for
the forty jobs were then intercorrelated and the resulting r’s are shown
in Table 2. This correlation matrix was factor analyzed using the centroid
method as described by Guilford (1). After five factors were extracted
the process was discontinued when several criteria¹ indicated that
additional factors would be the result of chance variance only. The
rotations were performed using the graphic method described by Guilford
(1) and the factor loadings before and after rotation are presented in
Table 3.
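
For readers unfamiliar with the centroid method, the extraction of the first factor can be sketched as below. This is a minimal illustration of the general technique, not a reproduction of the authors' computations: the 3 by 3 matrix is hypothetical, the diagonal holds guessed communalities (here the largest correlation in each column, one common convention), and the sign reflections needed before extracting later factors are omitted.

```python
import math

def first_centroid_factor(R):
    """First centroid loadings: each column sum divided by the square
    root of the grand total of the matrix (communalities on the diagonal)."""
    n = len(R)
    column_sums = [sum(R[i][j] for i in range(n)) for j in range(n)]
    root_total = math.sqrt(sum(column_sums))
    return [s / root_total for s in column_sums]

def residual_matrix(R, loadings):
    """Remove the variance accounted for by one factor from the matrix."""
    n = len(R)
    return [
        [R[i][j] - loadings[i] * loadings[j] for j in range(n)]
        for i in range(n)
    ]

# Hypothetical correlation matrix with guessed communalities on the diagonal.
R = [
    [0.8, 0.8, 0.6],
    [0.8, 0.8, 0.5],
    [0.6, 0.5, 0.6],
]

loadings = first_centroid_factor(R)
residuals = residual_matrix(R, loadings)
print([round(a, 3) for a in loadings])
```

A useful check on the arithmetic is that every column of the first residual matrix sums to zero; it is for this reason that some variables must be reflected in sign before a second centroid factor can be extracted.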

Findings and Interpretations 

Identification of Factors. In Table 4 the scale items are ranked under 
each factor according to size of rotated factor loading (.400 or above). 
These loadings were used in defining the factors. Loadings on “Total 
Points” were not considered in defining factors. 
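
The ranking rule just described (keep items with rotated loadings of .400 or above and list them in order of size) can be sketched as below. The Factor V loadings are those reported in the text of this paper; the single below-cutoff entry is a hypothetical value added only to show an item being dropped, and the function name is an assumption of the sketch.

```python
def salient_items(loadings, cutoff=0.400):
    """Return (item, loading) pairs at or above the cutoff, largest first."""
    kept = [(item, value) for item, value in loadings.items() if value >= cutoff]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Rotated loadings on Factor V as reported in the text, plus one
# hypothetical low loading (Working Conditions) for illustration only.
factor_v = {
    "Learning Period": 0.434,
    "Resp. for Equipment or Process": 0.592,
    "Surrounding Conditions": 0.404,
    "Resp. for Material or Product": 0.617,
    "Working Conditions": 0.150,  # hypothetical, not from the paper
    "Initiative and Ingenuity": 0.420,
    "General Schooling": 0.567,
    "Mental and Visual Demands": 0.440,
}

ranked = salient_items(factor_v)
for item, value in ranked:
    print(f"{item:35s} {value:.3f}")
```

The items heading such a list are the ones that carry most weight in naming the factor.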


1 For example, the sum of the guessed communalities equaled 14.381; the sum of the 
computed communalities after four factors equaled 13.970, after five factors 14.525. 
Using Tucker’s φ test (11), the limiting value for φ was determined at .889, and φ for
four factors was .859, while for five factors it was .890. The largest cell value in the
fourth residual matrix was .126, in the fifth .095. 


Table 4
Factor Names with Scale Items Arranged in Order of Magnitude of Loadings *

Factor                                      Item                             Loading

I. Skill Demands (General)                   1. Education                     .774
                                            14. Learning Period               .749
                                            13. General Schooling             .730
                                             3. Initiative and Ingenuity      .730
                                             2. Experience                    .717
                                             7. Resp.-Material
                                             9. Resp.-Work of Others
                                            16. Job Hazards
                                             5. Mental and Visual Demands

II. Supervisory Demands                      9. Resp.-Work of Others          .650
                                             8. Resp.-Safety of Others        .647
                                             5. Mental and Visual Demands     .563
                                             2. Experience                    .531
                                             3. Initiative and Ingenuity
                                             1. Education
                                             7. Resp.-Material
                                            14. Learning Period               .428
                                             6. Resp.-Equipment

III. Job Characteristics, Non-hazardous      4. Physical Demands              .875
                                            10. Working Conditions            .854
                                            15. Surrounding Conditions        .843
                                            16. Job Hazards                   .433

IV. Job Characteristics, Hazardous          16. Job Hazards                   .689
                                            11. Unavoidable Hazards           .655

V. Job Responsibility                        7. Resp.-Material                .617
                                             6. Resp.-Equipment               .592
                                            13. General Schooling             .567
                                             5. Mental and Visual Demands     .440
                                            14. Learning Period               .434
                                             3. Initiative and Ingenuity      .420
                                            15. Surrounding Conditions        .404

* Items from NEMA System appear in light face and those from the Simplified
System are in bold face. Only loadings of .400 or greater are listed.


“Skill Demands, General.” Factor I appears to be a general intellectual
ability or general skill factor. It has high loadings for items from
both systems, namely: from the NEMA, Education .774, Initiative and
Ingenuity .730, and Experience .717; and from the Simplified, Learning
Period .749, and General Schooling .730. It corresponds to the “Skill
Demands, General” factor found in previous studies (5, 7, 8, 9) and has
been tentatively designated as such.

“Supervisory Demands.” Factor II appears to be specific to the
NEMA system. The highest loadings for NEMA items were: Responsibility
for Work of Others .650, Responsibility for Safety of Others .647,
Mental and Visual Demands .563, and Experience .531. Only one item,
Learning Period, from the Simplified system appeared in this factor with
a medium loading of .428. The two items with the highest loadings,
Responsibility for Work of Others and Responsibility for Safety of Others,
seem to indicate this factor is largely one of supervision. The fact that
the other two responsibility items, Responsibility for Equipment or Process
and Responsibility for Material or Product, did not show up high in
this factor seems to lend additional support to the conclusion that this is
a Responsibility for People, i.e., Supervision, rather than a “Responsibility
for one’s own work” factor. The appearance of several other items
with medium loadings in this factor suggests that it may possibly involve
general responsibility but, due to the isolation of another responsibility
factor (Factor V), it seemed best to define this factor tentatively as
“Supervisory Demands.” This corresponds to a factor found in a previous
study (7).

“Job Characteristics—Non-Hazardous.”’ Factor III appears to be a 
clear-cut factor pertaining to the physical characteristics of the job without 
regard to skill demands. The high loadings from the NEMA scale were 
for Physical Demands .875 and Working Conditions .854; for the Sim- 
plified scale for Surrounding Conditions .843, and a medium loading for 
Job Hazards .433. This factor is very similar to factors found in previous 
studies (5, 7, 8, 9) and has been similarly called “Job Characteristics, 
Non-Hazardous.” 

“Job Characteristics—Hazardous.” Factor IV is another rather clear-cut 
factor. Only one item from each scale had significant loadings in 
this factor, from the NEMA, Unavoidable Hazards .655, and from the 
Simplified, Job Hazards .689. This then seems to refer to the hazardous 
conditions involved in the job and it has been named “Job Character- 
istics, Hazardous” which is similar to a factor found previously (5). 

“Job Responsibility.” Factor V appears to be another responsibility 
factor. Items with highest loadings from the NEMA scale were Respon- 
sibility for Material or Product .617 and Responsibility for Equipment or 
Process .592, and from the Simplified scale, General Schooling .567. 
Medium loadings were found for Mental and Visual Demands .440, and 
Initiative and Ingenuity .420 from the NEMA, and for Learning Period 
.434 and Surrounding Conditions .404 from the Simplified scale. Em- 
phasis seems to be on responsibility for material things, or for one’s own 
work rather than for work of others, or a matter of carefulness instead of 
supervision. It has therefore been designated “Job Responsibility.” 


Discussion of Results 


Of the five factors found, Factors I, III, and IV appear to be quite 
clearly defined while Factors II and V are much more general in nature. 
There seems to be some indication, from the items having medium 
loadings in these latter two factors, that these two are composite factors 
containing some item variance which might be drawn out by a sixth 
factor if provision were made for its isolation by inclusion of several other 
items. This factor might be of the nature of “Specific Skill Demands.” 

The authors do not intend to imply that the five factors identified here 
would completely account for job evaluation of all types of jobs. There 
are numerous other occupations in the professional, clerical, and skilled 
ranks which probably could not be adequately evaluated on such a scale. 
However, if it can be assumed that these two job evaluation systems in- 
clude all of the important items necessary for evaluation of jobs for which 
they were designed, it seems that a scale comprising five factors could 
satisfactorily achieve the desired purpose. 

The hypothesis is suggested that for each “family” or group of similar 
occupations a core of items designed to measure the five factors found in 
this study may be used in setting up a job evaluation rating system. 

It is possible that another item, specific to the occupations in question, 
should be added to allow for evaluation of any unusual aspects that are 
not general to all or most occupations. This hypothesis is supported by 
the findings of Lawshe and Satter (5) who found a factor called “Attention 
Demands,” specific to jobs in a plant manufacturing small caliber ammunition 
where many jobs consisted of machine “attending” and visual 
inspection. This factor was not found in two other plants of a different 
nature where jobs did not require “attention” or “inspection” to as high 
a degree. It is further supported by the findings of Lawshe and Alessi 
(8), in that a factor identified as “Skill Demands—Specific” (in addition 
to a “Skill Demands—General” factor) was found. The results from 
these studies suggest the advisability of including an item specific to the 
jobs in question as well as items based on the basic factors found in this 
study but the testing of this hypothesis must be undertaken later. At 
any rate, it appears that by employing a job evaluation system consisting 
of item scales based on the five basic factors found, considerable time and 
effort could be saved in comparison to that involved in using a longer 
system, and, at the same time, as complete an evaluation could be made 
as the longer system permits. 

It should not be inferred, however, that the authors recommend im- 
mediately abolishing presently used scales in favor of a short, five or six 
item scale. It is realized that frequently it is desirable to use items 
possessing what might be termed “face validity” in spite of the fact that 
such items contribute nothing additional to the scale. That is, if supervisors 
and employees believe that a certain element is important, and are 
agreed that it should be included, it may be highly desirable that it be 
included for policy reasons even though it may be statistically shown that 
its contribution is nil. Care should be taken, though, to determine that 
such an item does not detract from the reliability (and, if possible, from 
the validity) of the scale if it is introduced. 

Fig. 1. Relative contribution of each factor to total variance in the two systems. 
The shaded bars represent the NEMA System and the solid bars represent the 
Simplified System. [Bar chart with one pair of bars per factor: I Skill Demands 
(General); II Supervisory Demands; III Job Characteristics—Non-Hazardous; 
IV Job Characteristics—Hazardous; V Responsibility.] 

Comparison of Systems. Figure 1 indicates the relative variance in 
the two systems attributable to each of the five factors. The “total 
points” factor loadings for the NEMA System and for the Simplified 
System respectively were squared and converted to per cent of h² for total 
points for each factor. It will be observed that Factors I and V carry 
the most weight in the Simplified System and Factors I and II in the 
NEMA. This can be partially accounted for by considering the standard 
deviations of the ratings of the several items in each system. These 
S.D.’s are shown in Table 5. Inasmuch as the S.D.’s for items (1) Education, 
(2) Experience, and (3) Initiative and Ingenuity in the NEMA 
System are considerably larger than the S.D.’s for the other items, and 
the “Total Points” score was obtained by a simple summation of the 
point ratings on all items, it follows that the factors which included these 
three items would account for a major portion of the variance (h²). Similarly, 
in the Simplified System, items (13) General Schooling and (14) 
Learning Period, having relatively much larger S.D.’s, would contribute 
most to total points and to the resulting variance. It is not suggested 
that a similar distribution of variance would be found with other systems 
nor that the distribution found here is an optimum. It merely indicates 
the relative weight or importance of these factors taking into account the 
actual weights that were assigned to the items included in each system. 
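
The argument above, that items with large S.D.'s dominate a total obtained by simple summation, can be illustrated with a short sketch. The S.D. values are transcribed from Table 5; the function names are hypothetical, and the variance shares rest on a simplifying assumption that is not the authors' (covariances between items are ignored), so the output is indicative only.

```python
# Standard deviations of item ratings, transcribed from Table 5.
NEMA_SD = {
    "Education": 13.10,
    "Experience": 23.54,
    "Initiative and Ingenuity": 14.19,
    "Physical Demand": 6.46,
    "Mental and Visual Demand": 3.19,
    "Resp. Equipment": 3.25,
    "Resp. Material": 3.29,
    "Resp. Safety of Others": 3.58,
    "Resp. Work of Others": 3.77,
    "Working Conditions": 7.06,
    "Unavoidable Hazards": 2.98,
}
SIMPLIFIED_SD = {
    "General Schooling": 39.00,
    "Learning Period": 45.60,
    "Surrounding Conditions": 8.28,
    "Job Hazards": 10.60,
}


def approximate_ratios(sds):
    """Divide each S.D. by the smallest S.D. in the same system (Table 5 footnote)."""
    smallest = min(sds.values())
    return {item: sd / smallest for item, sd in sds.items()}


def variance_shares(sds):
    """Share of the summed item variances carried by each item.

    Assumption for illustration only: covariances between items are ignored.
    """
    total = sum(sd ** 2 for sd in sds.values())
    return {item: sd ** 2 / total for item, sd in sds.items()}


if __name__ == "__main__":
    for system, sds in (("NEMA", NEMA_SD), ("Simplified", SIMPLIFIED_SD)):
        ratios = approximate_ratios(sds)
        shares = variance_shares(sds)
        for item in sds:
            print(f"{system:10s} {item:26s} ratio {ratios[item]:4.1f}  share {shares[item]:.2f}")
```

Under that assumption the three NEMA items named in the text carry most of the summed item variance, and the two Simplified items carry nearly all of it.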

Table 5 
Standard Deviations of Ratings on Individual Items 

Item                                   S.D.     Approximate Ratio* 

NEMA 
    Education                         13.10 
    Experience                        23.54 
    Initiative and Ingenuity          14.19 
    Physical Demand                    6.46 
    Mental and Visual Demand           3.19 
    Resp.—Equipment                    3.25 
    Resp.—Material                     3.29 
    Resp.—Safety of Others             3.58 
    Resp.—Work of Others               3.77 
    Working Conditions                 7.06 
    Unavoidable Hazards                2.98 
Simplified 
    13 General Schooling              39.00 
    14 Learning Period                45.60 
    15 Surrounding Conditions          8.28 
    16 Job Hazards                    10.60 

[The entries of the Approximate Ratio column are not legible in the source.] 

* The “approximate ratio” is derived by dividing each standard deviation by the 
smallest one in that system. 


It is appreciated that the factor loadings depend in part on the rotations 
made between the factors found and that different rotations would 
have resulted in different loadings and therefore different contributions to 
the communality. However, in this study several factors (I, III, IV) 
were quite clear-cut and only one other reasonable rotation, in addition 
to those employed (between Factors II and V), seemed possible. This 
rotation did not seem to permit as significant an interpretation of the 
factors. It is therefore believed that the variance contribution of the 
five factors shown in Figure 1, as based on the weights of individual items 
and the most logical rotations of axes, indicates, at least in close 
approximation, the relative importance of these factors. 

The correlation between the average total point NEMA ratings in the 
forty jobs and the average total point ratings under the Simplified System 
was .90 (10). This lack of perfect agreement between the two systems 
may in part be explained by the relative difference in weight or impor- 
tance of the several factors in each system as indicated above. However, 
the statement that there appears to be substantial agreement between the 
two systems seems to be adequately supported. Inasmuch as neither of 
the systems is perfectly reliable, there is reason to believe the “true” 
correlation between them is higher than the observed r of .90 and that it 
may be estimated by correcting the obtained correlation for attenuation. The 
resulting r of .94 leads to the conclusion that if the measures were more reliable 
the correlation between them would approach 1.00 and that, in effect, they 
both measure the same variable or variables. 
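
The step from .90 to .94 can be sketched with the usual correction for attenuation, which divides the observed correlation by the square root of the product of the two reliabilities. The reliabilities of the two systems are not stated in this passage, so the sketch below only backs out the product they would need to have; the function names are hypothetical.

```python
import math


def correct_for_attenuation(r_observed, reliability_x, reliability_y):
    """Estimated correlation between true scores of two fallible measures."""
    return r_observed / math.sqrt(reliability_x * reliability_y)


def implied_reliability_product(r_observed, r_corrected):
    """Product of the two reliabilities implied by an observed and a corrected r."""
    return (r_observed / r_corrected) ** 2


if __name__ == "__main__":
    product = implied_reliability_product(0.90, 0.94)
    # Geometric mean of the two reliabilities implied by the reported figures.
    print(f"implied reliability product: {product:.3f}")
    print(f"implied geometric-mean reliability: {math.sqrt(product):.3f}")
```

On these figures the two systems would need reliabilities averaging (geometrically) roughly .96 for the corrected value to come out at .94.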

Validity. This study does not attempt to answer any questions or 
draw any conclusions regarding the validity of either system, since neither 
can be considered a criterion against which to evaluate the other. Be- 
cause it would seem profitable to conduct validity studies using actual 
wage data as a criterion, research along these lines is presently being 
planned. 


Summary and Conclusions 


Twenty analysts rated forty job descriptions by two job evaluation 
systems, the NEMA System and a Simplified System designed by the 
senior author. Intercorrelations were obtained between the item ratings 
made and Thurstone’s centroid method of factor analysis was used to 
determine the fundamental factors accounting for these intercorrelations. 
The following conclusions are supported. 

1. Five factors were found which seem to account for the elements 
considered in the two systems. These factors were tentatively identified 
as: Skill Demands (General), Supervisory Demands, Job Character- 
istics—Non-Hazardous, Job Characteristics—Hazardous, and Job Re- 
sponsibility. 

2. It seems quite possible that other factors not identified here, but 
peculiar to certain industries or job families may be isolated in future 
studies. 

3. It appears from the available evidence that, for accurate and com- 
plete job evaluation, fewer factors are necessary than are usually used in 
present job evaluation systems. The desirability of further investiga- 
tions to isolate and more clearly define these factors is indicated. 

4. No conclusions about validity can be drawn from this study due 
to the lack of a suitable criterion. Investigation of this problem also 
appears desirable. 

5. Although short job evaluation systems consisting of only a few 
items may be statistically and logically justified, it may be practically 
advantageous to include additional items in the system which will make 
it more acceptable to raters and to employees. 


Received October 10, 1947. 




Improving the Selection of Linotype Trainees 


George B. Strother 


In order to meet the increased educational demands produced by the 
GI Bill of Rights, the University of Missouri has inaugurated several non- 
collegiate vocational courses for veterans. These courses are in fields 
where demand for skilled personnel is high. Institutional or on-the-job 
training facilities seem inadequate to meet the demand. Such a course 
is the one semester linotype operators school at the University of Missouri. 
The school is integrated with an on-the-job training program on com- 
pletion of the school. The situation in the linotype school is ideal for 
the use of selection tests since there is a waiting list of students and a 
high rate of turnover. Turnover is used here broadly to include those 
who do not complete the course, those who complete the course with in- 
ferior ratings, and those who prove unsatisfactory on the job. 

Much of the demand for operators in the area is in small rural shops 
and the demands on the operator, besides linotype operation, often in- 
clude: the maintenance and repair of the machine; the duties of com- 
positor; proofreading; and even reporting, selling, and editing. It was 
not feasible to obtain criteria on factors other than linotype operation in 
this study. Since it was evident that minimum standards on operation 
would improve selection, even though it neglected other significant 
variables, this investigation knowingly neglected these other factors which 
are of varying degrees of importance in different types of shops. 

Two consecutive groups totalling twenty-nine persons which took the 
one semester course were tested at the beginning of training using the 
following battery: a. Schrammel-Brannon revision of the Army Alpha; 
b. The Kuder Preference Record; c. The Minnesota Vocational Test for 
Clerical Workers; d. The MacQuarrie Mechanical Aptitude Test; e. The 
Revised Minnesota Paper Form Board; and f. O’Connor-Tweezer Finger 
Dexterity. 

The criteria studied were terminal grades in the course, lines of type 
set per hour at the end of the course, and errors made. The grade cri- 
terion yielded negligible correlations with test results due to lack of ade- 
quate grade scatter and unreliability of grades. 

Speed and error criteria were combined according to the following 
formula: Score equals lines per hour minus twice the number of errors. 

Errors were weighted in this manner because of the fact that accuracy 
is related to speed not merely in terms of lines culled but in terms of 
spoilage and repeat settings. This criterion was satisfactory in the 
opinion of journeyman operators instructing in the course. 
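
The combined criterion stated above can be written as a one-line function. The function name and the sample figures are hypothetical; only the formula (lines per hour minus twice the number of errors) comes from the text.

```python
def criterion_score(lines_per_hour, errors):
    """Combined speed-and-accuracy criterion: lines per hour minus twice the errors."""
    return lines_per_hour - 2 * errors


if __name__ == "__main__":
    # Hypothetical trainee: 100 lines per hour with 9 errors.
    print(criterion_score(100, 9))
```

Doubling the error count means that each error costs the trainee the equivalent of two lines of output.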


Table 1 
Means, S.D.’s and Product-moment r’s of Test Scores with Criterion. N = 29 

Test                               r with Criterion     Mean     S.D. 

Army Alpha                               .62           168.3     31.7 
Kuder 
    Mechanical                           .31            79.3     17.1 
    Computational                        .02            35.9      9.0 
    Scientific                           .05            57.3     11.2 
    Persuasive                          −.27            65.4     13.4 
    Artistic                             .18            47.0     11.8 
    Literary                            −.08            64.2     17.3 
    Musical                              .08            17.9     10.5 
    Social Service                      −.41            58.2     11.8 
    Clerical                            −.11            61.1      5.7 
Minn. Test for Clerical Workers 
    Number                               .54           105.9     17.9 
    Name                                 .57           104.1     23.8 
Minnesota Paper Form Board               .29            38.7      8.9 
MacQuarrie 
    Tracing                              .24            34.7      6.7 
    Tapping                              .18            34.8      9.1 
    Dotting                              .31            19.2      3.0 
    Copying                              .49            41.1     12.4 
    Location                             .19            27.4      6.9 
    Blocks                               .37            17.6      5.4 
    Pursuit                              .41            26.2      6.1 
    Total                                .59            67.8      7.8 





The number on whom complete data were available was twenty-nine. 
This includes all but two persons who took the initial battery. One of 
those who did not complete the course died. The reason for the other 
drop is not known. Thus the attenuation of correlations due to drops 
resulting from inaptitude is negligible. 

Table 1 shows the zero order correlation coefficients of test scores 
with criterion. From Table 1 it will be noted that the most significant 
correlations are the Army Alpha, the Kuder Mechanical, Kuder Per- 
suasive, Kuder Social Service, the two sections of the Minnesota Test 
for Clerical Workers, the Minnesota Paper Form Board and the Mac- 
Quarrie Total Score. In order to keep the multiple-regression equation 
within bounds, it was decided to work with only five of the above scores. 
All five tests were retained in the final analysis and the highest coeffi- 
cient for each test was selected (where several scores were obtained). 
An exception to this was made on the Kuder where the highest positive 
coefficient was selected and the higher negative coefficient was bypassed 
because it was felt that the negative interest relationship would be difficult 
to utilize in dealing with applicants or counselees. Multiple correlation 
analysis by the Doolittle method¹ was carried out on the following 
scores: a. Army Alpha; b. Kuder, Mechanical; c. Name section of Min- 
nesota Clerical; d. MacQuarrie, Total Score; and e. Minnesota Paper 
Form Board. 

O’Connor Finger Dexterity showed a product-moment r of .31 and 
might be desirable in such a battery, but was dropped because data were 
incomplete. 

Table 2 shows intercorrelations of the five tests, which were the basis of 
the multiple coefficient. 


Table 2 
Inter-Correlations of Variables Used to Determine the Multiple Regression Equation 

                                       Army     Kuder      Minn. Clerical 
                          Criterion    Alpha    (Mech.)    (Name)            M.P.F.B. 

Army Alpha                   .62 
Kuder (Mech.)                .31       −.03 
Minn. Cler. (Name)           .57        .80      −.23 
Minn. Paper Form Board       .29        .35      −.17        .32 
MacQuarrie (Total)           .59        .36   [illegible]    .37               .07 


The multiple correlation analysis gave a coefficient of +.82 with the 
criterion. The multiple regression equation and standard error of estimate 
are as follows: 


\[
X = .1111\,X_1 + .4080\,X_2 + .2511\,X_3 + .3295\,X_4 + .8151\,X_5 - 62.75
\]

R = .82. 
S.E. est. R = ± 10.64. 

where 

1 = Army Alpha, Raw Score. 

2 = Kuder Mechanical, Raw Score. 

3 = Minn. Voc. Test for Clerical Workers (Name Comparison), Raw Score. 

4 = Minn. Paper Form Board, Raw Score. 

5 = MacQuarrie, Total Raw Score. 


¹ Guilford, J. P. Psychometric methods. New York: McGraw-Hill, 1936, pp. 393-397. 


Table 1 gives means and standard deviations of raw scores for the 
group. 
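
As a rough check on the regression equation, the sketch below applies it to a candidate whose raw scores equal the group means of Table 1. The dictionary keys and function name are hypothetical, and the fourth multiplier is read as .3295 (with its decimal point restored), which is an assumption about the printed figure.

```python
# Raw-score multipliers from the regression equation; the fourth is read as .3295.
COEFFICIENTS = {
    "Army Alpha": 0.1111,
    "Kuder Mechanical": 0.4080,
    "Minn. Clerical (Name)": 0.2511,
    "Minn. Paper Form Board": 0.3295,
    "MacQuarrie Total": 0.8151,
}
INTERCEPT = -62.75

# Group means of the five tests, from Table 1.
MEANS = {
    "Army Alpha": 168.3,
    "Kuder Mechanical": 79.3,
    "Minn. Clerical (Name)": 104.1,
    "Minn. Paper Form Board": 38.7,
    "MacQuarrie Total": 67.8,
}


def predict_criterion(raw_scores):
    """Predicted criterion score (lines per hour minus twice the errors)."""
    weighted = sum(COEFFICIENTS[test] * raw_scores[test] for test in COEFFICIENTS)
    return weighted + INTERCEPT


if __name__ == "__main__":
    print(f"prediction at the group means: {predict_criterion(MEANS):.2f}")
```

A least-squares equation evaluated at the predictor means returns the criterion mean, so the result (about 82) should sit near the middle of the obtained scores, which is consistent with the score intervals of the scattergram.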

Table 3 
Scattergram of Predicted Scores on Obtained Scores (R = .82) 

[Scattergram: predicted scores in five-point intervals from 65-70 to 110-115 
across the columns, obtained scores in five-point intervals from 40-45 to 
110-115 down the rows; cell entries are not legible in the source. N = 29.] 


Several facts of interest come out in an inspection of the regression 
equation. Since the raw scores differ widely in variability, the multipliers 
in the equation do not give any indication of the relative contribution 
of any given test to the prediction of the criterion. The mechanical 
interest factor and the Paper Form Board make substantial contributions 
to the equation in spite of relatively low r’s. They seem to measure 
phases of the total picture not otherwise covered to any great extent. 
The Army Alpha, although it shows the highest r of any of the tests, 
apparently overlaps with several of the other tests, so that its contribution 
is only nominal. The MacQuarrie and the Minnesota Test for 
Clerical Workers (names) seem to be the most useful tests if a shorter 
battery were desired. 

Since the number of cases is small (N = 29), some variability is to 
be expected as subsequent data are accumulated. The present equation 
represents a working formula which is statistically reliable but tentative 
with regard to the actual figures obtained. 
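
The remark that the raw multipliers say nothing about relative contribution can be illustrated by scaling each multiplier by the S.D. of its test (Table 1). This is only a sketch: a true standard partial regression weight would also divide by the S.D. of the criterion, which is not reported here, so the figures below are comparable with one another but are not beta weights. Names are hypothetical, and the fourth multiplier is read as .3295.

```python
# (raw-score multiplier, S.D. of the test from Table 1)
TESTS = {
    "Army Alpha": (0.1111, 31.7),
    "Kuder Mechanical": (0.4080, 17.1),
    "Minn. Clerical (Name)": (0.2511, 23.8),
    "Minn. Paper Form Board": (0.3295, 8.9),
    "MacQuarrie Total": (0.8151, 7.8),
}


def scaled_weights(tests):
    """Change in predicted criterion per one-S.D. change in each test."""
    return {name: b * sd for name, (b, sd) in tests.items()}


if __name__ == "__main__":
    for name, weight in sorted(scaled_weights(TESTS).items(), key=lambda kv: -kv[1]):
        print(f"{name:24s} {weight:5.2f}")
```

On this scaling the Army Alpha term falls well below the Kuder Mechanical, Minnesota Name and MacQuarrie terms, in line with the statement that its contribution is only nominal despite its having the highest zero order r.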

Table 3 represents a scattergram of scores predicted by the multiple 
regression equation and the obtained scores. Applicants for admission 
to the course are to be tested with the battery finally selected and pre- 
diction of their standing will be made. On the basis of this plus an in- 
terview blank, selection of students will be made. It can be readily seen 
that, given any particular selection ratio, the selection of operators should 
be substantially improved by use of the above diagram. The extent of 
improvement will depend upon the favorableness of the ratio involved. 
Given the correlation value, the number of applicants and the number 
of individuals who will be admitted, the Taylor-Russell tables² can be 
applied and the per cent of improvement determined. 


Received Aug. 25, 1947. 
² Tiffin, J. Industrial psychology. New York: Prentice-Hall, 1946, pp. 363-367. 








Per Cent Increase in Output of Selected Personnel 
as an Index of Test Efficiency 


R. F. Jarrett 
University of California 


It recently occurred to the writer that a very useful measure of the 
adequacy of a program of testing for employee selection (or classification) 
would be the ratio of the mean output of a group selected on the basis of 
their high test-scores to the mean output of an unselected¹ group. A 
search of the literature revealed that Richardson (8) had called attention 
to the desirability of describing the efficiency of a testing program in these 
terms. He has suggested a method of predicting the improvement in 
output of selected workers given the validity coefficient, the selection 
ratio, p (the proportion of applicants selected), and an estimate of a 
constant, k, which is the ratio of the mean output of the upper P%² of 
unselected personnel to the mean output of the lower (100 — P)% of 
unselected personnel. 

He computes the value of E defined by the equation: 


\[
E = \frac{r\,(k - 1)(1 - p)}{1 + p\,(k - 1)} \tag{1}
\]





where r is the validity coefficient of the test, k is the ratio previously de- 
fined, and p is the selection ratio, or the proportion of applicants to be 
selected. 
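
A minimal sketch of this index follows. It assumes the fourfold-table form E = r(k − 1)(1 − p) / [1 + p(k − 1)], which is a reading of the printed equation rather than a quotation from Richardson. The validity coefficient .52 is not stated in the text; it is the value inferred here as reproducing the 0.60 quoted later for Richardson's example (k = 3.5, p = .25). The function name is hypothetical.

```python
def richardson_e(r, k, p):
    """Proportional gain in mean output of the selected group over the unselected group.

    Assumed form: E = r (k - 1)(1 - p) / (1 + p (k - 1)).
    r: validity coefficient, k: ratio of upper to lower mean output,
    p: proportion of applicants selected.
    """
    return r * (k - 1.0) * (1.0 - p) / (1.0 + p * (k - 1.0))


if __name__ == "__main__":
    r_inferred = 0.52  # assumption: inferred, not given in the text
    for k in (3.5, 1.65):
        print(f"k = {k:4.2f}: E = {richardson_e(r_inferred, k, 0.25):.2f}")
```

With the same inferred r, lowering k from 3.5 to 1.65 moves E from about 0.60 to about 0.22, the two figures discussed further on in the paper.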


His solution is based upon the relations between the point correlation 
coefficient and the various proportions in the marginal totals of a four-fold 
table. A more general solution is available, and it is the purpose of 
this paper to call attention to some implications of Richardson’s solution 
and to present the general solution together with tables describing test 
efficiency under various circumstances. 


¹ The term “unselected” must be interpreted throughout this paper to mean, “the 
members of that population of individuals who apply for the job in question and who, 
when individuals were needed for the job in question, would have been put to work 
without further regard for their qualifications before a testing program was initiated.” 
It is necessary thus to qualify the expression because numerous selective factors make 
the “unselected” population for the job of janitor in a New York office building, for 
example, quite different from the unselected population for the job of hosiery looper in 
a mill in Illinois. It is desired here to evaluate the efficiency of the test; thus 
“unselected” must be carefully qualified as indicated. 

² Throughout this paper the upper-case P will be used to denote the per cent selected, 
while the lower case p denotes the proportion selected. Thus if 85% of applicants are 
selected, P = 85, but p = 0.85. The letter “q” similarly will denote the per cent 
rejected when upper case, the proportion rejected when lower case. 

Before proceeding to the general solution, however, it is of interest to 
note some implications of Richardson’s solution and the example to which 
he applies it. It should perhaps be made explicit here that the primary 
requirement which must be met in order that any method of the type 
here considered may be used is that the criterion must be one to which 
the coefficient of variation may be legitimately applied; that is, the cri- 
terion must yield scores in equal units measured from an absolute zero. 
For it is clear that the ratio k may assume any value between unity and 
infinity by the arbitrary location of a relative zero. The method would 
thus appear to be without utility in the case of such criteria as, for ex- 
ample, ratings of superiors. This imposes a restriction upon the use of 
methods of the type under discussion, but in considering production costs, 
management is concerned chiefly with those criteria to which the method 
is applicable. 

In his illustrative example, Richardson has assumed the ratio of the 
mean output of the upper twenty-five per cent of workers to the mean 
output of the lower seventy-five per cent to be 3.5. In view of the fact 
that in very few reported cases does the ratio of best to poorest individual 
worker reach this value (3, p. 35), and in view of the general importance 
of this ratio to the determination of E, it is of interest to determine (even 
though this should require some assumption about the form of the dis- 
tribution) a general expression for this ratio. This derivation is quite 
simple under the assumption that the distribution of outputs is normal 
or can be described reasonably satisfactorily by a mutilated normal curve. 

Consider first the ratio k for a normal distribution of unit standard 
deviation and mean c. In this situation the absolute zero of production 
is c units below the mean. So long as c is about 2.6 or greater there will 
be little loss of generality in assuming the distribution to be strictly nor- 
mal with unit area. 


Let us write: 

\[
\bar{z}_1 = \frac{\bar{X}_1 - c}{\sigma_X} = \bar{X}_1 - c \tag{2}
\]

and the corresponding equation for $\bar{z}_2$. In the above expression the 
variance of $X$ is assumed unity, and the bars indicate means. Thus $\bar{z}_1$ is 
the deviation of the mean of the upper P% of the cases from the general 
mean, $\bar{z}_2$ is the deviation of the mean of the lower Q%, i.e., (100 − P)%, 
of the cases from the general mean, $\bar{X}_1$ and $\bar{X}_2$ are the respective 
raw-score values of those means, and c is the mean of the total distribution. We 
now write for the ratio sought: 

\[
k = \frac{\bar{X}_1}{\bar{X}_2} = \frac{c + \bar{z}_1}{c + \bar{z}_2} \tag{3}
\]

Now, by a well-known relation, if X is normally distributed: 

\[
\bar{z}_1 = \frac{z}{p} \quad \text{and} \quad \bar{z}_2 = -\frac{z}{q} \tag{4}
\]

where z is the height of the dividing ordinate. We then write: 

\[
k = \frac{c + \dfrac{z}{p}}{c - \dfrac{z}{q}} \tag{5}
\]

Clearing, we obtain: 

\[
k = \frac{cp + z}{cq - z} \cdot \frac{q}{p} \tag{6}
\]


Richardson has noted the desirability of deriving formulae of the type 
he presents without making assumptions as to the form of the distri- 
butions involved, but often it is necessary to make such assumptions in 
order to obtain insight into the magnitude of the value of E to be ex- 
pected; some such assumption must be made if k is to be estimated for 
a general case, though the observed value of k may be employed where 
it is available. 

It is of interest to observe that were the distribution of worker outputs 
essentially normal, the value of k used by Richardson in his illustrative 
example could not, as a matter of fact, be approached. On substituting 
the appropriate values of p, q and z in (6) and assuming the 
mean of the distribution to be three standard deviations above zero 
(i.e., c = 3 or v = .33, where v is the coefficient of variation), the value 
of k for a twenty-five per cent to seventy-five per cent division will be 
found to be only 1.65, as opposed to the value of 3.5 assumed in the example. 
For this situation the ratio of best to poorest worker would be 
very large indeed, of the order of 11 to 1, and if c be taken somewhat 
greater in order to reduce this ratio to a value more nearly in agreement 
with observed values of this ratio, the value of k becomes smaller. 
Substituting the hypothetical value of k in formula (1) yields a value of 0.22 
for E, a value which, though less impressive than the 0.60 of Richardson’s 
example, would appear to be more nearly representative of the effectiveness 
of the selective procedure. 
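
The value of k quoted for the normal case can be checked numerically. The sketch below evaluates the closed form k = [(cp + z)/(cq − z)](q/p), with z the normal ordinate at the cut, using the standard library's normal distribution. The function names are hypothetical, and placing the best and poorest workers at 2.5 standard deviations either side of the mean is an assumption made here to illustrate the "11 to 1" remark, not a figure from the text.

```python
from statistics import NormalDist


def k_normal(c, p):
    """Ratio of the mean output of the upper fraction p to that of the lower fraction q.

    Outputs are taken as normal with unit S.D. and mean c above the absolute zero.
    """
    q = 1.0 - p
    standard = NormalDist()
    cut = standard.inv_cdf(q)   # deviate leaving the fraction p above it
    z = standard.pdf(cut)       # height of the dividing ordinate
    return (c * p + z) / (c * q - z) * (q / p)


def best_to_poorest(c, spread=2.5):
    """Ratio of outputs at +spread and -spread S.D. (spread is an assumption)."""
    return (c + spread) / (c - spread)


if __name__ == "__main__":
    print(f"k for c = 3, p = .25: {k_normal(3.0, 0.25):.2f}")
    print(f"best to poorest for c = 3: {best_to_poorest(3.0):.0f} to 1")
```

Computed without table rounding, k comes out a shade under 1.66, which agrees with the 1.65 given in the text to within rounding.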

Production figures—for understandable reasons—rarely find their way 
into print. A cursory search, however, revealed two sets of such data, 
and the writer has found the ratio of the mean outputs of the upper fourth 
to the lower three-fourths of employees for these two sets. 

For 99 hosiery loopers (11, p. 7) of more than one year’s experience 
the upper one-fourth of the workers averaged 21.3 units output per hour, 
while the lower three-fourths averaged 16.2 units per hour. The value of 
k is thus 1.32, somewhat less than the value for the ideal normal distri- 
bution. 

Such a selected group, however, is not typical of the employment 
situation. Data are presented in the same source (11, p. 6) for 203 
loopers of varying experience. This distribution yields a k-ratio of 1.64, 
almost exactly the value which would be yielded by the normal curve. 
It may be noted that this agreement between theoretical and empirical 
values is obtained despite the fact that the distribution had a secondary 
mode in the lowest interval. 
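
Where individual production figures are available, k can be computed directly rather than assumed. The sketch below shows one way of doing so; the function name and the sample outputs are hypothetical, and the only figures taken from the text are the two group means for the 99 experienced loopers.

```python
def k_from_outputs(outputs, p):
    """Mean output of the top fraction p divided by the mean output of the rest."""
    ranked = sorted(outputs, reverse=True)
    n_upper = round(len(ranked) * p)
    if not 0 < n_upper < len(ranked):
        raise ValueError("p must leave at least one worker in each group")
    upper, lower = ranked[:n_upper], ranked[n_upper:]
    return (sum(upper) / len(upper)) / (sum(lower) / len(lower))


if __name__ == "__main__":
    # Hypothetical hourly outputs for eight workers.
    sample = [10, 12, 14, 16, 18, 20, 22, 24]
    print(f"k for the hypothetical sample: {k_from_outputs(sample, 0.25):.2f}")
    # Group means reported in the text for the 99 experienced loopers.
    print(f"k from the reported means: {21.3 / 16.2:.2f}")
```

The quotient of the two rounded means printed in the text is about 1.31, against the 1.32 reported, a difference of the size expected from rounding the means.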

It would not be unreasonable to suppose that in the employment 
situation the distribution of worker output might be somewhat positively 
skewed with cases tending to pile up toward the low-production end. A 
development along lines similar to the one presented above leads quite 
straight-forwardly to the following equation for the half-normal curve: 


\[
k = \frac{cp + 2z_{p/2}}{cq + .798 - 2z_{p/2}} \cdot \frac{q}{p} \tag{7}
\]

where $z_{p/2}$ is the height of the ordinate dividing the mutilated normal 
distribution into an upper part of P% of the total area and a lower part 
of Q%, i.e., (100 − P)%, of the area; this will be the ordinate dividing the 
total normal curve into an upper part containing P/2% and a lower part 
containing Q/2%. The value .798 is simply double the modal ordinate. 
It is not reasonable to assume the origin of this curve to fall at the mean 
of the total distribution, for this would imply that the modal unselected 
worker produced just nothing. If c be set equal to 1 (which yields for 
the mutilated distribution approximately the same value of v assumed for 
the normal case above), however, the selection ratio, p, remaining .25, k 
for this distribution becomes 1.74, a value only slightly more favorable 
than that obtained on the assumption of normality. 
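
The half-normal figure can be checked in the same way. The sketch assumes the form k = [(cp + 2z)/(cq + .798 − 2z)](q/p), where z is the ordinate of the full normal curve at the point leaving p/2 in its upper tail and c is the distance of the mode above the absolute zero; this is a reading of the printed equation, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def k_half_normal(c, p):
    """k for outputs following the upper half of a unit normal curve whose mode lies c above zero."""
    q = 1.0 - p
    standard = NormalDist()
    cut = standard.inv_cdf(1.0 - p / 2.0)   # leaves p/2 in the upper tail of the full curve
    z = standard.pdf(cut)                   # ordinate of the full normal curve at the cut
    double_mode = 2.0 * standard.pdf(0.0)   # the constant .798 of the text
    return (c * p + 2.0 * z) / (c * q + double_mode - 2.0 * z) * (q / p)


def cv_half_normal(c):
    """Coefficient of variation of the same distribution."""
    mean = c + math.sqrt(2.0 / math.pi)
    sd = math.sqrt(1.0 - 2.0 / math.pi)
    return sd / mean


if __name__ == "__main__":
    print(f"k for c = 1, p = .25: {k_half_normal(1.0, 0.25):.2f}")
    print(f"coefficient of variation for c = 1: {cv_half_normal(1.0):.3f}")
```

With c = 1 the coefficient of variation is close to .33, as the text states, and k comes out near the 1.74 reported.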

Also possible is a degree of skewness intermediate between the sym- 
metry of the full normal curve and the extreme skewness of the half- 
normal curve. The ratio k may be estimated in an equally simple manner 
for the distribution obtained by truncating the normal distribution so as 
to leave the upper three-fourths of the curve. In this instance, k may 
be shown to be: 

\[
k = \frac{3cp + 4z_{3p/4}}{3cq + 1.272 - 4z_{3p/4}} \cdot \frac{q}{p} \tag{8}
\]

where $z_{3p/4}$ is the height of the ordinate dividing the mutilated curve into 
the portions p and q; it divides the total curve into the portions 
$\tfrac{3}{4}p$ and $\tfrac{3}{4}q$. 

Assuming the same value for the selection ratio and setting c equal to 
1.77 which will maintain v = .33, k is found to be 1.73. 
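
The same check for the three-fourths curve is sketched below. It assumes the form k = [(3cp + 4z)/(3cq + 1.272 − 4z)](q/p), where z is the ordinate of the full normal curve at the point leaving 3p/4 in its upper tail, 1.272 is four times the ordinate at the truncation point, and c is the distance of the full curve's mean above the absolute zero. This is a reading of the printed equation, and the function name is hypothetical.

```python
from statistics import NormalDist


def k_three_quarter_normal(c, p):
    """k for a unit normal curve with its lowest fourth removed; c is the full curve's mean above zero."""
    q = 1.0 - p
    standard = NormalDist()
    truncation = standard.inv_cdf(0.25)          # point below which the curve is cut away
    cut = standard.inv_cdf(1.0 - 0.75 * p)       # leaves 3p/4 in the upper tail of the full curve
    z = standard.pdf(cut)
    four_z0 = 4.0 * standard.pdf(truncation)     # the constant 1.272 of the text (about 1.271)
    return (3.0 * c * p + 4.0 * z) / (3.0 * c * q + four_z0 - 4.0 * z) * (q / p)


if __name__ == "__main__":
    print(f"k for c = 1.77, p = .25: {k_three_quarter_normal(1.77, 0.25):.2f}")
```

The result agrees with the 1.73 given in the text, and lies between the full normal and half-normal values as the argument expects.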

In view of the nature of the selective factors at work it would appear 
idle to consider the value of k for negatively skewed distributions. 

It appears clear that the coefficient of variation is of primary impor- 
tance in the determination of k. It will be noted that for all three of the 
types of distributions considered, k is of the same order of magnitude 
so long as the coefficient of variation is maintained constant. This 
suggests that the ratio k may be relatively insensitive to the form of the 
distribution, a finding of particular interest in view of the conditions 
underlying the more general derivation to which we now turn. 

At the turn of the century Karl Pearson (5) was able to derive ex- 
pressions for the means, standard deviations, and intercorrelations ob- 
taining among several organs when the individuals possessing them had 
been selected with respect to one or more other organs. At that time he 
found it necessary to assume all distributions, selected as well as un- 
selected, to be normal or Gaussian in form, although in the light of the 
then-recent demonstration of Yule that the various relationships involving 
the correlation coefficient could be derived without recourse to this as- 
sumption, he expressed the opinion that his findings might later be found 
not to depend upon that assumption. This opinion he himself verified 
a few years later (6), observing that “the method is absolutely independ- 
ent of Gaussian theory . . . but it does assume that linearity applies 
within the degree of useful approximation.” Much later during World 
War II, Cyril Burt (2) in England and workers in the Aviation Psychology 
Program of the U. S. Army Air Forces (10) had occasion again to concern
themselves with the effects of selection on intercorrelations—this time 
with correlations between psychological tests and criteria of job profi- 
ciency. Referring to Pearson’s early work, Burt (2) derived expressions 
of the influence of selection without recourse to the assumption of nor- 
mality, apparently in ignorance that Pearson himself had done so earlier. 

It is of interest to note in passing that although men working with the
testing program of both the British and American armies found it necessary
to employ the results of this early study of Pearson’s in connection
with the correction of validity coefficients for variability of the groups
taking the tests, they appear not to have recognized that the results there
provided also form the basis of a very useful criterion of test efficiency.

It would be redundant here to repeat either the general derivation 
or that for the special case of interest. It will suffice to observe that if 
the symbols be defined as noted below, the difference between the mean 
criterion score of those individuals selected by the “aptitude test’’ and 
the general mean of the unselected group on the criterion is given by: 


$$\bar{y}_s - \bar{y} = r_{xy}\,\sigma_y \left(\frac{\bar{x}_s - \bar{x}}{\sigma_x}\right), \tag{9}$$

where

$\bar{x}$ = mean of scores on selective device for all candidates (unselected)

$\bar{x}_s$ = mean of scores on selective device of those candidates selected

$\bar{y}$ = mean of criterion scores for all candidates (unselected)

$\bar{y}_s$ = mean criterion score of those candidates selected

$\sigma_x$ = standard deviation on selective device of all candidates (unselected)

$\sigma_y$ = standard deviation of criterion scores of all candidates (unselected)

$r_{xy}$ = product-moment coefficient of correlation between selective device
and criterion.


It is desired to express the gain through the use of the test as a pro- 
portion of the mean performance of an unselected group. We thus write: 


$$\frac{\bar{y}_s - \bar{y}}{\bar{y}} = r_{xy}\,\frac{\sigma_y}{\bar{y}} \left(\frac{\bar{x}_s - \bar{x}}{\sigma_x}\right). \tag{10}$$

Writing $v_y$ for the ratio of the standard deviation of y to the mean of y
(the coefficient of variation), (10) becomes

$$E' = \frac{\bar{y}_s - \bar{y}}{\bar{y}} = r_{xy}\,v_y \left(\frac{\bar{x}_s - \bar{x}}{\sigma_x}\right). \tag{11}$$

The simplicity of (11) is gratifying. This equation expresses the fact
that 100 men selected on the basis of their high scores on a test will
produce 100E’% more than 100 men selected without reference to their test
score. It will be noticed that the relative increase in mean criterion score
is directly proportional to three familiar quantities: the validity
coefficient, the coefficient of variation of the criterion, and the relative
deviation of the mean test score of accepted applicants from the general test
mean. It should be emphasized that all these data are available
immediately after the first validation study of the selective instrument.
* Note that assuming the x-distribution to be a dichotomized normal distribution,
substituting the value of $(\bar{x}_s - \bar{x})/\sigma_x$ obtained from normal curve theory, and solving
for r in this equation yields a formula for the biserial correlation coefficient.
The appearance of the coefficient of variation of unselected criterion
scores deserves comment at this point. It will be noted that Richardson’s
k was found to be dependent upon the value of v and apparently insensitive
to distribution form. The coefficient of variation is the variable
(magnitude) analogue of the unselected-success-ratio of Taylor and
Russell. The fact that $v_y$ enters (11) as it does implies that for those
situations in which among unselected personnel the good producers are 
relatively little more productive than the poor ones, no testing program 
may be expected to increase the efficiency of operation unless it is possible 
to take advantage of extremely rigorous selection. 

It is to be noted that no assumptions have been made as to the nature 
of any of the distributions involved; none are necessary if the psychologist 
is well acquainted with the nature of the distribution of test scores (x)
in the population to which it will be administered as a selective device. 
This familiarity will permit him to estimate the last term of (11) with 
satisfactory accuracy for any given selection ratio. Inasmuch, however, 
as many psychological tests yield distributions (in the situations to which 
they are applied) which approximate to the normal distribution, it will 
not be amiss to employ this assumption for the purposes of illustrating 
this criterion and suggesting the magnitude of the relative increase in 
output to be expected from testing programs. 
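
Before normality is introduced, equation (11) can be evaluated directly from an observed set of test scores. The sketch below is a minimal illustration under stated assumptions: the ten scores are invented for the example, the function names are hypothetical, the population form of the standard deviation is used for $\sigma_x$, and the validity and coefficient of variation are supplied by the user rather than estimated.

```python
from statistics import mean, pstdev


def standardized_selection_gain(test_scores, selection_ratio):
    """(mean of selected - mean of all) / SD of all, where the highest
    scoring fraction `selection_ratio` of candidates is selected."""
    ranked = sorted(test_scores, reverse=True)
    n_selected = max(1, round(len(ranked) * selection_ratio))
    selected = ranked[:n_selected]
    return (mean(selected) - mean(ranked)) / pstdev(ranked)


def relative_gain(validity, criterion_cv, test_scores, selection_ratio):
    """Equation (11): E' = r_xy * v_y * (x_s_bar - x_bar) / sigma_x."""
    return validity * criterion_cv * standardized_selection_gain(
        test_scores, selection_ratio)


# Hypothetical test scores for ten applicants (illustrative only).
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
gain = relative_gain(validity=0.3, criterion_cv=0.37,
                     test_scores=scores, selection_ratio=0.3)
print(f"Estimated relative gain E' = {gain:.3f}")
```

No distributional form is assumed here; the last term of (11) is simply read off the scores in hand, which is the situation the paragraph above describes.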

If $r_{xy}$ and $v_y$ are known or assumed to have certain values, the
evaluation of E’ from (11) requires only a determination of $(\bar{x}_s - \bar{x})/\sigma_x$. This
value is obtained, on the assumption of normality, from the first equation
of (4) where z is the height of the ordinate cutting off the upper P%
(selection ratio) of the curve. Formula (4) has been evaluated and
tabled for all values of P (7). These tables thus provide the information
necessary to permit the estimation of the effects of a selection program.

Substituting (4) in (11), we have

$$E' = r_{xy} \left(\frac{v_y z}{p}\right). \tag{12}$$


4 In 1939 Taylor and Russell (9) called attention to the fact that although the low
validity coefficients often obtained in industrial testing situations would probably never
permit reliable prediction of individual performance on the criterion, it is possible to
determine with some accuracy the proportion of a selected group who would be “successful”
on a criterion. They called attention to the fact that even for low validity coefficients,
if only a small proportion of candidates were selected, the proportion of those
selected who would be “successful” could become very high indeed. They introduced
the term “selection ratio” and emphasized the necessity of taking this proportion into
account in any interpretation of the effectiveness of a selective program. Their method
requires definition of “successful” criterion performance in terms of what the present
writer calls the “unselected-success-ratio,” or the proportion of unselected individuals
whose criterion performance exceeds a critical value below which workers will be classified
as unsuccessful and above which they are classified as successful.
From (12) it appears that, for a given value of $v_y$ and p, the relative
increase in the effectiveness of selected workers over that of unselected
workers is directly proportional to the correlation coefficient. Thus the
validity coefficient itself, rather than the coefficient of alienation, the
index of forecasting efficiency, the coefficient of determination, or other
functions of the correlation coefficient, may be considered adequately
to reflect the benefits to be expected of a testing program.5 It is unfortunate
(in the sense of introducing complication), however, that the closeness
of the relation between test and criterion is not the only factor influencing
the efficiency of the testing program, so that the validity coefficient
does not tell the whole story. It is of interest to note that E’ is
also directly proportional to $v_y$ and that the ratio z/p is positively
accelerated with increasing rigorousness of selection, so that changing p from
25% to 20% improves E’ more than the change from 50% to 45%.
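
The positive acceleration of z/p can be checked numerically. The following sketch, which assumes a normal distribution of test scores and uses a hypothetical helper name, compares the rise in z/p for the two changes in selection ratio mentioned above.

```python
from statistics import NormalDist


def z_over_p(p):
    """Height of the unit-normal ordinate cutting off the upper
    proportion p, divided by p. Under normality this equals the mean
    standard score of the selected group."""
    std_normal = NormalDist()
    cut = std_normal.inv_cdf(1.0 - p)
    return std_normal.pdf(cut) / p


gain_25_to_20 = z_over_p(0.20) - z_over_p(0.25)
gain_50_to_45 = z_over_p(0.45) - z_over_p(0.50)
print(f"25% -> 20%: z/p rises by {gain_25_to_20:.3f}")
print(f"50% -> 45%: z/p rises by {gain_50_to_45:.3f}")
```

Because E’ is proportional to z/p for fixed validity and coefficient of variation, the larger rise at the more rigorous end carries over directly to E’.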

Equation (11) would seem to be preferable to (1) due to the greater 
generality of (11) resulting from the substitution for the ratio k of the 
coefficient of variation, v, which assumes but one value for all values of 
the selection ratio. In the case of k, a separate value must be computed 
for each value of the selection ratio, p, while (11) requires only the value 
of v for the determination of the values of E’ for a series of values of p. 
Both equations should be expected to yield essentially similar values of 
E for a given situation, but (11) would appear to be the more direct ap- 
proach. The demonstration of the insensitiveness of k to distribution 
form suggests, moreover, that considerable confidence may be placed in 
the simpler equation (12), even though the distribution of test scores 
is not strictly normal. 

The use of (12) would, however, be much simplified by the availability
of tables of the value $v_y z/p$ which appears in parentheses in (12). For this
purpose and to give numerical illustration of the improvement to be expected
under various circumstances, the accompanying table has been
prepared showing for several values of the selection ratio and several 
values of the coefficient of variation, the value of one hundred times this 
function. Multiplication of the tabled values by the validity coefficient 
yields directly the per cent increase in mean productivity of selected 
versus unselected workers. As an illustration of the use of the table, 
consider the data on hosiery loopers presented earlier. For the 203 un- 
selected loopers the coefficient of variation was 0.37. Assuming a selec- 
tion ratio of 0.25 and a validity coefficient of .3, interpolation in the table 
reveals that the selected personnel will produce about 0.3 × 47% or

5 This point has been pressed—on quite different grounds—by Johnson (4) and by 
Brogden (1). 
about 14% more than unselected personnel. In a mass-production system,
a 14% increase in production would appear well worth striving for,
and it does not appear unreasonable to expect that the cost of initiating 
and maintaining the testing program will be more than paid for by such 
an increase. Formula (1) yields a value of 12.4% from these data, a 
value in good agreement, as is to be expected. 
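
The hosiery-looper illustration can be reproduced without interpolation by computing the tabled quantity directly. This is a sketch under the normality assumption of equation (12); the function names are hypothetical, and the inputs are the values quoted in the text (v = 0.37, p = 0.25, validity .3).

```python
from statistics import NormalDist


def tabled_value(v, p):
    """One hundred times v * z / p, where z is the unit-normal ordinate
    cutting off the upper proportion p (the quantity given in the table)."""
    std_normal = NormalDist()
    z = std_normal.pdf(std_normal.inv_cdf(1.0 - p))
    return 100.0 * v * z / p


def percent_increase(validity, v, p):
    """Per cent increase in mean output of selected over unselected
    workers: the tabled value multiplied by the validity coefficient."""
    return validity * tabled_value(v, p)


loopers_tabled = tabled_value(0.37, 0.25)
loopers_gain = percent_increase(0.3, 0.37, 0.25)
print(f"tabled value: {loopers_tabled:.1f}")
print(f"expected increase: {loopers_gain:.1f}%")
```

The same function can be used to check individual table entries; for instance, tabled_value(0.30, 0.10) agrees with the entry 52.65 to two decimals.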

There remains one matter to be treated before the formulas are left 
to the use of management’s test-consultants. This is the matter of the 
reliability of the predicted efficiency. Unfortunately this is not a simple 
matter, and the writer is not at present prepared to present the definitive 
answer to the problem. About the solution, however, some things seem 
clear. If the population of unselected criterion scores were perfectly 
known and the correlation between test and criterion known for the popu- 
lation, it is apparent that any particular random sample of the unselected 
population of workers would yield for a given selection ratio (substituting 
the parametric value of the correlation coefficient and the coefficient of 
variation in (11)) a random sample from the population of selected cri- 
terion scores. The standard error of the means of such random samples 
would, of course, be the ratio of the standard deviation of the population 
of selected criterion scores to the square root of the number of individuals 
selected. An approximation to the standard error of the estimated mean
may thus be obtained from Pearson’s (5) expression for the standard
deviation of the selected criterion scores. If Y be considered the criterion,
X the test, $\sigma_x$ and $\sigma_y$ the standard deviations of the unselected
population, $\Sigma_y$ the standard deviation of the population of Y’s remaining after
direct selection of X’s, and $s_x$ the standard deviation of the X’s of the selected
group, the standard deviation sought is given by the formula:

$$\Sigma_y = \sigma_y \left[1 - \left(1 - \frac{s_x^2}{\sigma_x^2}\right) r_{xy}^2\right]^{1/2}. \tag{13}$$


Now if the distribution of test scores be assumed normal to a first
approximation, the ratio $s_x^2/\sigma_x^2$ may be determined from the incomplete normal
moment functions for any value of the selection ratio. The writer has
determined this ratio for several values of p and for several values of r. 
This arithmetical work reveals that for the range of correlation coeffi- 
cients likely to be met with in practice (.1 to .5) the standard deviation 
of the selected Y’s cannot be expected to be much less than .9 the standard 
deviation of the unselected Y’s for even the most rigorous selection. It 
is therefore suggested (in view of the approximate nature of this standard
error, due to the failure to allow for the sampling errors in r or in the
various standard deviations involved) that the standard error of the
predicted mean output be taken as approximately equal to $\sigma_y/\sqrt{Np}$, where N
is the total number of cases in the unselected group of applicants and p
the proportion selected.
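
The argument of the last two paragraphs can be sketched numerically. The code below assumes normally distributed test scores, uses the standard truncated-normal variance for the selected group, and applies Pearson's expression as given in (13); the function names and the sample figures (a criterion standard deviation of 10 and 400 applicants) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def selected_variance_ratio(p):
    """Variance of test scores in the upper proportion p of a normal
    curve, as a fraction of the unselected variance."""
    std_normal = NormalDist()
    cut = std_normal.inv_cdf(1.0 - p)
    lam = std_normal.pdf(cut) / p      # mean standard score of selected
    return 1.0 + cut * lam - lam * lam


def selected_sd_ratio(r, p):
    """Equation (13) divided by the unselected criterion SD."""
    return sqrt(1.0 - (1.0 - selected_variance_ratio(p)) * r * r)


def approx_standard_error(sigma_y, n_applicants, p):
    """Suggested approximation: sigma_y / sqrt(N * p)."""
    return sigma_y / sqrt(n_applicants * p)


# Rigorous selection (upper 5%) with a validity of .5:
print(f"SD ratio: {selected_sd_ratio(0.5, 0.05):.3f}")
# Hypothetical case: sigma_y = 10, N = 400 applicants, 25% selected.
print(f"standard error: {approx_standard_error(10.0, 400, 0.25):.2f}")
```

Even at this severe selection ratio the ratio stays close to .9, which is the basis for substituting the unselected standard deviation in the suggested standard error.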


Summary and Conclusions 


1. Attention is called to the desirability of being able to describe the 
efficiency of a selective testing program in terms of the ratio of the mean 
production of a group selected by the program to the mean production 
of an unselected group. Attention is also called to Richardson’s ap- 
proach to the problem in terms of the relations between the point correla- 
tion and the four marginal totals of the four-fold table. 

2. A literal expression is derived (under the assumption of normality 
of distribution) for the ratio (Richardson’s k) of the mean of the upper 
P% of a group to the mean of the lower Q% [(100—P)%] of the group. 

3. Expressions for this ratio are derived for the skewed distributions 
yielded by the upper half and the upper three-fourths of a normal dis- 
tribution, and k is shown to be of the same order of magnitude for all 
three of these forms so long as the coefficient of variation is held constant. 
It is shown to be sensitive to changes in v.


Table 1 


The Value of the Quantity 100vz/p for Various Values of p (the Selection Ratio)
and v (the Coefficient of Variation) *

                                  Selection Ratio, p

   v      5%    10%    20%    30%    40%    50%    60%    70%    80%    90%    95%

  .01    2.06   1.76   1.40   1.16   0.97   0.80   0.64   0.50   0.35   0.20   0.11
  .05   10.31   8.77   6.99   5.80   4.83   3.99   3.22   2.48   1.75   0.98   0.54
  .10   20.63  17.55  14.00  11.59   9.66   7.98   6.44   4.97   3.50   1.95   1.09
  .20   41.25  35.10  28.00  23.18  19.32  15.96  12.88   9.93   7.00   3.90   2.17
  .30   61.88  52.65  41.99  34.77  28.98  23.94  19.32  14.90  10.50   5.85   3.26
  .40   82.51  70.20  55.99  46.36  38.63  31.92  25.76  19.87  13.99   7.80   4.32

* Here v is taken simply as the ratio of standard deviation to mean. Note that since
the quantity tabled is a linear function of v, the table may be extended to any value of v
by multiplying the value for v = .10 by the appropriate quantity.


4. A more general solution to the problem of the ratio of the mean 
output of a group of selected workers to the mean output of a group of 
unselected workers is presented from Pearson’s general solution to the 
problem of the effects of selection. 

5. The efficiency of a testing program so defined was shown in this 
derivation to be directly proportional to: (a) the validity coefficient; (b) 
the coefficient of variation among the unselected criterion scores; (c) and
to a function of the rigorousness of selection, or the selection ratio. 

6. The formula presented as the general solution (11) is dependent
solely upon the assumption that regression is linear to a useful degree of 
approximation. 

7. A table is presented from which, knowing the validity coefficient, 
the selection ratio, and the coefficient of variation of the unselected cri- 
terion scores, the per cent improvement resulting from selection may be 
estimated on the assumption of normality, and evidence is presented 
which suggests that failures of this assumption such as might be met 
with in practice will affect the tabled values only slightly. 

8. Some suggestions are made as to a reasonable value for the standard 
error of the estimated mean of selected criterion scores. 


Received July 9, 1947. 
Inter-Relationships of Selected Personnel Functions *


Elmer R. John 
University of Minnesota 


The author was pressed by two circumstances to construct the ac- 
companying chart, and it is presumed that many personnel directors as 
well as faculty members who teach personnel administration face these 
same situations. The first was the problem of presenting a discussion 
of personnel functions to business management in a way that would make 
it clear as to why so many procedures are used in present-day personnel 
programs. Personnel directors often submit to management a list of 
procedures that they would like to put into effect, but they use lengthy 
and confusing verbal explanations to justify the items in the list. Fre- 
quently the specialized terminology is new to top management, and the 
verbose explanations that accompany them all too often stimulate re- 
sistance to their acceptance. A clear outline of how personnel practices 
are related to each other and to the over-all objectives of the organization 
is what appears to be needed. Management says, “You’re the experts.
Show us clearly what should be done and why.” This chart was made
primarily as a partial answer to this request, but it might also be useful
to personnel managers who are establishing new programs and who want 
an outline of the connections between various phases of a functioning 
personnel department. 

The second need for this chart was encountered in the teaching of a 
university course in personnel psychology. One important objective is 
to teach principles, methods, and facts related to job analysis, job evalua- 
tion, selection and placement, merit rating, etc. It is quite another task 
to convey to students a clear understanding of the inter-relationships of 
all these procedures. In this connection, it was found that, when a se- 
lected list of personnel functions is given to a class and the students are 
asked to produce their own charts of the inter-relationships, the project 
stimulates student participation in the discussion and leads to better 
understanding of the topic. 

The accompanying chart (Figure 1) is a modification of the original
that was used in conference with management. New and more detailed
relationships are suggested by each person who looks at it critically, but
on certain other relationships there is consistent agreement. Since clarity
is the underlying purpose of making the chart at all, the number of lines
on it has been kept to a minimum. Many more lines of indirect and
tenuous relations could still be shown. Pigors and Myers, for example,
say that job analysis is related to “selection and placement, training,
transfer, upgrading, and promotion, and in making wage surveys, . . .
safety program, and as a partial basis for time studies in connection with
wage-incentive plans” (1, pp. 220-221). This summary statement is
perfectly true. Likewise, numerous similar possibilities of fine
inter-relationships exist between other functions than just job analysis; but
to draw lines for all of these direct and indirect connections would
complicate the diagram to the point of uselessness. Consequently, only the
more obvious and direct relationships are represented by connecting lines.

* Midland Cooperative Wholesale, Minneapolis, Minnesota, for whom the writer
prepared this chart, encouraged and sponsored the early publication of this paper.
The author acknowledges indebtedness to Donald G. Paterson, Editor, and to Philip H.
Kriedt, University of Minnesota, for helpful suggestions and criticisms.

[Figure 1: Inter-relationships of Selected Personnel Functions (Excluding Labor
Relations and Collective Bargaining)]

Fig. 1. Chart showing inter-relationships of selected personnel functions.

Upon first inspection, it may appear that many of the relationships 
mentioned by Pigors and Myers above are not indicated here. This 
chart, however, shows that job analysis is related to promotion and trans- 
fer; but the connection is shown as going by way of job evaluation rather 
than directly. In the same way, job analysis is related to selection 
through job specifications. By this method, many relationships can be 
traced which might otherwise seem to the reader to have been omitted 
from the diagram. 

The selected functions have been taken from Yoder (5) and Pigors
and Myers (1), but they can be found, at least in part, in almost any
textbook on personnel administration.1 The diagram is not designed to give
new facts, but merely to show, graphically, the relationships that do exist
and which have been described by authors in this field. To take just a
few instances: Line No. 3 of the diagram (connecting Selection and
Placement with Training Program) is explained by Yoder, “A second
principle (in establishing any effective training program) is involved in the
selection of those who are to be trained” (5, p. 242). Lines No. 2 and
No. 6 are spoken of by Shartle when, in discussing job specifications, he
says, “Here the items from the job analysis report have been edited for
use in the company employment office” (3, p. 46). Line No. 7 is included
so as to show that, “Determination of training needs . . . can be
effectively accomplished only in terms of job analysis and records maintained
by the personnel division” (5, p. 219). Pigors and Myers verbalize Lines
No. 10 and No. 24 as follows: “Before attempting to evaluate jobs,
however, it is necessary to know what a worker does on each specific job. . . .
This is job description and analysis . . .” (1, p. 220); “With a
well-administered employee-rating plan, management is in a better position to
develop a sound promotion policy” (1, p. 174). These quotations are
but a few examples from the literature which imply or express the
inter-relationships pictured by the lines in Figure 1.

1 Readers familiar with elaborate charts of personnel functions, such as Figures 8
and 34 in Scott, Clothier et al. (2, pp. 32 and 138), may see a decided similarity. Their
diagram, however, is tied in more with the traditional organization type of chart.

The solid lines indicate that one function is basic to or useful in the 
administration of another. The arrows point to the procedures that are 
relatively dependent on others. When a reciprocal connection exists, the 
arrows point in both directions. Dotted lines are used to show the more 
tenuous relationships. In cases where the over-all conditions in the or- 
ganization are affected by a certain procedure, or where a procedure could 
be related directly to most of the others (as in a Suggestion System), 
this broad relationship is summarized by an arrow to the arc at the right 
which symbolizes production, efficiency, and general morale. 

Detailed explanations of each line of relationship are not included here
because they are so much more fully described in the pages of any
adequate textbook on personnel administration.


Received January, 1948. 
Early publication. 
The Effect of Smoking on Tremor


A. S. Edwards 
University of Georgia 


Many studies on the effects of smoking have been made, but without
measurement as accurate as is desired, at least in some of the
experiments. Probably the best summary of results of smoking is that of
Hull.1 As he says, the measurement of the effects of smoking on tremor
has been unsatisfactory. It has continued to be unsatisfactory; and the
following experiments have been made to give accurate results in a field 
of growing importance since smoking has become so common. Prelimi- 
nary trials showed that with the finger tromometer, the effects on finger 
tremor could be detected and measured immediately after the smoking of 
half a cigarette. 

Does smoking increase finger tremor? Is it true, as some students 
argue, that they should not be required to go through two or three hour 
examinations without being permitted to smoke? The following report 
gives at least a partial answer to these and other questions about smoking. 
It also gives some information on a matter generally not understood,
namely, effects of smoking with inhaling as contrasted with smoking and 
not inhaling. The experiments include finger tremor and smoking one- 
half a cigarette; taking eight puffs on a cigarette in one minute; smoking 
with and without inhaling; effect of elimination of smoking for two hours; 
smoking “denicotinized” cigarettes; smoking corn silk with and without
inhaling; and breathing (but not smoking) in a smoke filled room. 

In contrast to the significant results to be reported below in connection
with finger tremor, it may be mentioned that comparatively slight results
have been found in our experiments upon the effect of smoking on body
sway. Results of three experiments have already been reported,2 and
another unpublished experiment has recently been done with 100 Ss,
college students between the ages of 18-30, including 50 smokers and 50
non-smokers. The results corroborated our earlier findings. Only in
exceptional cases were there increases in body sway; and statistical study
of the cases showed no differences before and after smoking with an
acceptable critical ratio. This does not mean that there was no effect of
smoking on body sway; but, save for the exceptional cases, the effects
were so small that they stand in sharp contrast with the immediate and
significant results of smoking on finger tremor which are reported below.

1 Hull, C. L. The influence of tobacco smoking on mental and motor efficiency.
Psychol. Monogr., 1924, 33, No. 3, Whole No. 150, 1-161.
2 Edwards, A. S. The measurement of static ataxia. Amer. J. Psychol., 1942, 55, 181.


Experiments with the Tromometer 


Apparatus. The author’s finger tromometer* was used throughout 
the series of experiments. This apparatus permits a tridimensional 
measurement of finger movement, front-back, right-left, and up-down. 
The sum of the three measurements in millimeters was used for statistical 
purposes. 

Procedure. Ss were asked to come into the examining room and make 
themselves as comfortable as possible. They were asked to sit in the 
chair which was placed in such a position that the extended arm pointed 
directly towards one unit of the apparatus and was at right angles to the 
second unit, the front of the third unit being directly above the finger 
loop. S was told that the object of the experiment was to measure the
amount of finger movement. The middle finger of the right hand was
placed in the loop which was drawn fairly tight at the base of the finger
nail. The standard position for the arm was extended but with elbow
just slightly bent so as to relieve undue strain. Ss were required to rest
three to five minutes at least before measurements began in order to
eliminate tremor that might have been caused by activities before coming
into the experimental room. During the measurements they were not 
permitted to talk. 

Instructions. S was seated and told to lean back and be comfortable 
with both feet on the floor, and not to talk. The standard instruction 
was given: “Try to be as still as possible.” 

Subjects. In all experiments, except where otherwise noted, there 
were 100 college students, selected at random, aged 16 to 35. 


Experiment 1. Effect of Smoking One-half a Cigarette 


Subjects. In this experiment there were 50 smokers and 50 non- 
smokers, aged 16-35. There were 32 men and 68 women. 

Each S was measured first as a control, then permitted to smoke
one-half a cigarette which was marked at the middle with a pencil mark. 
Then S was immediately measured for results of smoking on finger tremor. 

Results. Immediately a difference was found in finger tremor between
the control and experimental measurements. For non-smokers finger
tremor rose after smoking from 31.2 mm. to 36.8 mm. on the average, an
increase of 18 per cent, but with a C. R. of only 0.5. For the smokers the
finger tremor increased from an average of 48 mm. to an average of 67
mm., an increase of 39 per cent, with a C. R. of 2.7. This is twice as
much increase for smokers as for non-smokers. Similar results were
found when comparing men with women, and the increase for women was
considerably more than the increase for men. As has been found before,
men had more finger tremor than women with one exception, namely,
among the smokers the women had slightly more finger tremor after
smoking.

* Edwards, A. S. The finger tromometer. Amer. J. Psychol., 1946, 59, 273-283.


Experiment 2. Eight Puffs on a Cigarette 


The above experiment controlled the amount of a cigarette which was 
consumed. This experiment determined the number of puffs in one 
minute. 

Subjects. There were 108 Ss, college students aged 17-25: 28
non-smokers and 80 smokers. Of the non-smokers 14 were men and 14
women; of the habitual smokers 40 were men and 40 were women. All
were chosen at random, except for the effort to get both non-smokers and 
habitual smokers, and an equal number of men and women. 

Results. Again the non-smokers showed less difference, the control 
series before smoking giving a mean of 26.7 and the mean after smoking 
44.4 mm., the C. R. being only 1.96. There was large variability in- 
dicated both by the SD and the fact that there was much less difference 
between medians than between means. 

With the smokers, however, all of whom probably inhaled, the con- 
trol series gave an average of 35.5 mm., the experimental series a mean 
of 61.5 mm., an increase of 84 per cent, and there was a C. R. between 
the means of 5.8. The medians also showed a similar difference, 24.0 
before and 51.5 after smoking. For the habitual smokers, the 40 men’s 
averages showed more tremor than those for the 40 women both before 
and after smoking, although the increase after smoking was for the men 
78 per cent, and for the women 82 per cent. All differences were
statistically significant. For 11 non-smokers who inhaled there was an
increase in tremor after smoking and inhaling from 32 to 73.6 mm., an
increase of 129 per cent. For 7 of these who had the greatest increase the
average before smoking was 28.9 mm., and after smoking 107.6 mm., 
an increase in tremor of 272.3 per cent. This may be contrasted with the 
non-smokers who did not inhale who showed an increase of only 9.9 per 
cent after smoking. 

A problem was raised as to why smokers were affected more by 
smoking than were non-smokers. It was possible that increased finger 
tremor among the smokers was due to a cumulative effect; it was also 
possible that the differences were due primarily to inhaling. All habitual 
smokers, or nearly all, inhale. Non-smokers generally do not inhale and 
when they attempt to for the first time during an experimental series, they
are likely to have a very disagreeable experience.

These results plainly called for an experiment definitely set up for the 
study of smoking with and without inhaling. 


Experiment 3. Inhaling vs. Not Inhaling 


The following experiment was made to compare the effect of smoking 
with and without inhaling. In this series there were 22 Ss who smoked 
and inhaled and 8 controls who smoked but did not inhale. 

Subjects. Twenty-two college students who inhaled and who were 
willing to be tested during a 2½ hour period; 8 controls, college 
students who did not inhale and who were willing to be tested through a 
2½ hour period. All were selected at random apart from these two 
conditions. Their ages were 17-25. The Ss were men with the exception 
of two women in the experimental and three women in the control group. 

Results. The results of this experiment are reported as averages for 
four of the periods of testing: before smoking, after smoking, after no 
smoking for two hours, and after smoking again following the two hour 
period of no smoking. The averages for the 8 controls who did no 
inhaling were as follows for these 4 periods: 30.8, 26.9, 27.8, 27.2. 
For the 22 Ss who inhaled, the averages were 28.1, 51.1, 32.2, 53.9. 
Here is a very definite and significant difference. For those who 
inhaled there was a great increase in finger tremor; for those who did 
not inhale, only small and statistically insignificant differences 
appeared before and after smoking, and again before and after smoking 
following withdrawal. 
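The contrast between the two groups is easiest to see when each later period is expressed as a change from the first (before smoking) measurement. The following is a small illustrative sketch using the group means given above; the function name `change_from_baseline` is hypothetical.

```python
PERIODS = ["before", "after smoking", "after 2 hr. withdrawal", "after smoking again"]

# Group means in mm. as reported in the text.
controls = [30.8, 26.9, 27.8, 27.2]   # 8 Ss, no inhaling
inhalers = [28.1, 51.1, 32.2, 53.9]   # 22 Ss, inhaling


def change_from_baseline(means):
    """Difference (mm.) between each later period and the first period."""
    baseline = means[0]
    return [round(m - baseline, 1) for m in means[1:]]


for label, means in (("controls", controls), ("inhalers", inhalers)):
    print(label, dict(zip(PERIODS[1:], change_from_baseline(means))))
```

On these means the controls stay within about 4 mm. of their starting value throughout, while the inhalers rise by more than 20 mm. after each smoking period.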

Individual Cases. With controls not inhaling there were insignificant 
changes. In sharp contrast to this the Ss studied through 10 to 60 min- 
utes after smoking and inhaling, measured at from 3 to 10 minute inter- 
vals, showed the following results for individual cases: Before smoking 
26; after smoking and inhaling 36, 46, 43, 61, 74.7. The S having the 
smallest finger tremor showed only 5 mm., before smoking, but after 
smoking half a cigarette went up to 49.0 mm. and reported that the doctor 
had told her she must stop smoking. Another S before smoking had a 
finger tremor of 35 mm.; immediately after smoking it was 44.7, three 
minutes later 58.6. One S had before smoking, 14.3 mm.; after smoking, 
a tremor of 31, 49, 31, 33, and 31, with continued smoking of two cigarettes. 
Another S, a college athlete in the pink of condition, had before smoking 
a finger tremor of 30.3. Following the smoking and inhaling his finger 
tremor was 61 and after three minutes, 74.7. 

Other Ss were tried before smoking, after smoking without inhaling, 
and then smoking with inhaling. Examples of results follow: 44, 20, 63; 
19, 18, 57; 24, 23, 44. 

The trials in this series include not only smoking cigarettes, but 
smoking cigar and pipe. Results for smoking of pipe for the control who 
did not inhale are as follows: 24.3, 22.6, 23.3, 18.3, 22. An S who 
smoked and inhaled pipe smoke showed before smoking 24.0; after, 59.7, 
61.7, 89.0, 84.3, 77, and 59. Another pipe smoker who inhaled began 
with a tremor before smoking of 34.5. After smoking pipe and inhaling 
his tremor was 55, 60, 45.3 at 10 minute intervals. Another pipe smoker 
who inhaled started before smoking with a finger tremor of 30; after 
smoking and inhaling the tremor was 72, 64, 63. 

The following example may be given for cigar smoking with and with- 
out inhaling. The control who did not inhale had before smoking an 
average of 13.6 and after smoking 13.6, 12 and 10. The S who inhaled 
cigar smoke had before smoking an average of 9.7 and after smoking 32.3, 
34.6, and 26. 

Both the reports of the group with group controls, and the individual 
cases with controls showed large and significant differences in finger 
tremor after inhaling. No significant differences were found following 
smoking without inhaling. 


Experiment 4. Effects of Withdrawal of Smoking 


Another experiment was made with 100 Ss to check the results of the 
foregoing, to discover effects of withdrawal, and especially to determine 
the statistical significance of results. 

Subjects. College students, selected at random, 50 men and 50 
women, all of whom were habitual smokers, aged 17 to 33. These were 
chosen upon their willingness to go without smoking for a period of two 
hours. The time of withdrawal of smoking was actually between 2 and 
2½ hours. They were not watched during the 2 to 2½ hours, but most of 
them were students in the Department of Psychology, interested in the 
experiment, and their word was accepted that they had abstained for 
two hours. 

Results. For all Ss the following results appeared in the following 
order: Before smoking, after smoking, after abstaining from smoking for 
at least two hours and after smoking again after the withdrawal period: 
45.4, 58.0, 40.9, 58.6. All C. R.’s were significant, the C. R.’s between 
the averages before and after withdrawal being 4.44. It is thus apparent 
that for 100 unselected smokers a period of two or more hours withdrawal 
gives a significant decrease in finger tremor. For the women the re- 
duction in finger tremor was somewhat greater than for the men. In all 
cases the curve shows a rise after smoking before the withdrawal, a 
significant decrease following the withdrawal period, and a significant 
increase again after smoking. From this experiment it appears clear that 
withdrawal on the whole reduces finger tremor. The comments of the 
students indicated that many of them felt as well or better than usual 
and that they were somewhat surprised at the results. 
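The C. R. (critical ratio) quoted here is, in general, a difference between two means divided by the standard error of that difference. The article gives the means and the C. R.'s but not the SDs or the correlation between the repeated measurements, so the sketch below uses hypothetical SDs and a hypothetical correlation purely to show the arithmetic. It is not a reconstruction of the author's computation, and the choice of the two means compared is likewise only illustrative.

```python
import math


def critical_ratio(mean_a, sd_a, mean_b, sd_b, n, r=0.0):
    """Critical ratio for two means obtained from the same n subjects.

    r is the correlation between the paired measurements; r = 0 gives the
    formula for two independent samples of equal size.
    """
    se_a = sd_a / math.sqrt(n)
    se_b = sd_b / math.sqrt(n)
    se_diff = math.sqrt(se_a**2 + se_b**2 - 2 * r * se_a * se_b)
    return (mean_a - mean_b) / se_diff


# Means (after smoking, after withdrawal) are from the text.
# The SDs (20.0, 18.0) and r (0.5) are HYPOTHETICAL values for illustration.
cr = critical_ratio(58.0, 20.0, 40.9, 18.0, n=100, r=0.5)
print(round(cr, 2))
```

A larger correlation between the paired measurements shrinks the standard error of the difference and so raises the ratio, which is why the paired form matters for a before-and-after design such as this one.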

A brief summary of some important points from these experiments 
is shown in Tables 1 and 2. 


Table 1 

Results in mm. of the Comparison of Non-Smokers and Smokers Before and After 
Smoking One-half a Cigarette; and Before and After Taking 
8 Puffs on a Cigarette in One Minute 

                                   Before    After*    Increase 

½ cigarette:   50 non-smokers       31.2      36.8       18% 
               50 smokers           48.0      67.0       39% 

8 puffs:       28 non-smokers       26.7      44.4       65% 
               80 smokers           35.5      61.5       84% 





Table 2 

Effect (in mm.) of Withdrawal of Smoking * 
Note: Results are arithmetical means before and after smoking; after 2 hours 
without smoking; and immediately after smoking again following the withdrawal. 

                                                 Withdrawal   Smoking 
                              Before    After    2 Hours      Again 

  8 controls, no inhaling      30.8     26.9      27.8         27.2 
 22 smokers, inhaling          28.1     51.1      32.2         53.9 

100 smokers, inhaling          45.4     58.0      40.9         58.6 

* All C. R.’s for the 22 inhalers and the 100 inhalers were significant, the C. R.’s 
before and after withdrawal being more than 4. 


The reports of the Ss after withdrawal did not indicate any increase 
in nervousness, and many of them said that evidently they simply thought 
they needed a smoke, but actually felt as well or better after the two-hour 
period of no smoking. Some of them expressed surprise at the results 
which were shown them. So far as our results go there is at least no 
suggestion that students cannot go through a 2 to 2½ hour period of 
examination without smoking. There are, of course, other factors to 
consider; the habit of smoking and the urge to smoke may be disturbing 
when one finds oneself in a situation with a strong urge to smoke, but not 
permitted to do so. The question, however, may be raised as to whether 
such a person is not smoking excessively, and needs to reduce his smoking 
rather than to be permitted to indulge at any and all times. No evidence 
was found, but perhaps in too limited a number of cases, to show that 
smokers who do not inhale were inconvenienced by not being permitted 
to smoke for from two to 2½ hours. The two last series of experiments 
suggest that the greatest finger tremor, or for that matter, practically all 
increase of finger tremor, is found in those cases who smoke and inhale, 
and that the withdrawal of smoking for 2 to 2½ hours does not increase 
finger tremor, except in a few cases and that very little. From the 
statistical results and the comments it appears that withdrawal is 
beneficial. 


Experiment 6. Smoking “Denicotinized” Cigarettes 


Our results so far have been with smoking that involved nicotine; the 
two following experiments were made for the purpose of comparing these 
results with smoking which did not involve nicotine. One experiment 
was with so-called “denicotinized” cigarettes. The extent to which the 
nicotine has been removed is not known. The second experiment was 
with corn silk in which there was not any nicotine. 

With the same apparatus and standard procedure a series of experi- 
ments was made with Ss who had already been used in the experiments 
involving nicotine. With 10 Ss no distinguishable differences were 
found when tremor following the smoking of nationally advertised 
denicotinized cigarettes was compared with our results reported in other 
experiments. Since the results definitely duplicated those already found 
with standard cigarettes, the experiment was discontinued. 


Experiment 5. Smoking Corn Silk 


In order to be sure that there was no nicotine present dried corn silk 
was obtained from the University farm and was smoked by a number of 
our former Ss. 

Subjects. Four college students, one woman and three men, selected 
because of their interest in the experiment. It was found necessary to 
smoke the corn silk in pipes since it appeared to be very difficult to make 
cigarettes. The four students above mentioned were experimental Ss 
who inhaled, but the controls did not inhale. 

Results. Very clearly it appeared that there were no significant 
changes in tremor while smoking corn silk, either in the experimental or 
control Ss. Several pipefuls were smoked, and when no results appeared, 
the experiment was finished with the smoking of a standard brand of 
tobacco cigarette. Immediately the finger tremor increased as is shown in 
our other experiments, with Ss who inhaled. The smoking of the corn 
silk was continued nearly an hour in all cases, and yet no differences in 
finger tremor were apparent. This was in sharp contrast to the im- 
mediate rise when the Ss who inhaled finished the experiment with the 
standard brand cigarette and so-called denicotinized cigarettes. 


Experiments 6 and 7. Effect of Breathing Cigarette Smoke 


One further question was raised, namely, to what extent would breath- 
ing cigarette smoke in a smoke filled room affect finger tremor. It is re- 
lated to the question of the effect of breathing in busses, trains or rooms 
where people are smoking. 

Two advanced students attempted, under the direction of the writer, 
to find the answer to this question. Two rooms were required for this 
experiment: one thoroughly ventilated and without any tobacco smoke 
for the control measurements; the other, a room of approximately the 
size of a fairly large bus, except that the ceiling was somewhat higher. 
The control room was used before the S was taken into the smoke filled 
room. After control measurements were taken, S was taken into the 
smoke filled room and measured at the end of 3, 6, and 9 minutes after 
breathing the air of this room. Ss did not smoke throughout the ex- 
periment. 

There was no measurement of the actual amount of smoke in the 
experimental room but in the two experiments there was more smoke than 
is ordinarily found in so-called smoke filled rooms. In one experiment 
the Es smoked and had cigarettes burning without smoking so that there 
was definitely as much or more smoke than is found in the ordinary 
smoke-filled room. In the other experiment, the room was filled with 
cigarette smoke so that Ss’ eyes were affected and many Ss indicated 
distinct discomfort. The second condition was extreme and much worse 
than would be found in any smoking room, bus or train. 

Apparatus. The writer’s finger tromometer was used. 

Subjects. There were 80 Ss, 40 men and 40 women in the two ex- 
periments; half of the men and half of the women were used in each ex- 
periment. 

Procedure. Standard procedure was used in both experiments. First 
the Ss were required to rest from 3 to 5 minutes and the control measure- 
ments were made. They were then taken into the smoke-filled room and 
measured at intervals of 3, 6, and 9 minutes. 

Results. Although disagreeable feelings were reported in both 
experiments, especially the one with the greater amount of smoke, no 
significant results in finger tremor were found in either experiment. For 
all means and medians the differences were not greater than 4 mm. This 
is quite insignificant. So far as these experiments go, therefore, Ss who 
do not smoke during the period of the experiment, but who breathe air 
abnormally filled with cigarette smoke, show no significant increases in 
finger tremor. 


Conclusions 


1. Our experiments seem to show that smoking without inhaling has a 
small, or negligible, effect upon finger tremor. 

2. In contrast to this the smoking of standard tobacco cigarettes with 
inhaling has been followed immediately, even during the smoking of the 
first cigarette, by a large and significant rise in finger tremor. 

3. Withdrawal of smoking for two to two and one-half hours has 
shown large and significant decreases in finger tremor for Ss who inhale. 
Following the withdrawal period and after smoking again with inhaling, 
finger tremor has increased greatly. 

4. No differences from the above conclusions can be found with the 
use of so-called “denicotinized” cigarettes. When smoking these 
cigarettes and inhaling, large and significant increase in finger tremor has 
appeared. 

5. Experiments with cigars and pipe smoking have shown the same 
results as the smoking of standard cigarettes, namely, no increase in finger 
tremor, or a negligible amount, with the Ss who did not inhale; large 
increases in finger tremor when Ss did inhale. 

6. The results are very different with the smoking of dried corn silk. 
No increase of finger tremor, or very little, was found with the smoking 
of corn silk which was continued for one hour, even though the Ss inhaled. 
The results of the experimental and control groups were practically the 
same. There was no increase of finger tremor for either. 

7. Breathing cigarette smoke in amounts equal to and greater than 
that commonly found in smoke filled rooms, busses and trains, does not 
appear to be followed by any increase in finger tremor. 

8. So far as our experiments go, it appears that increase in finger 
tremor is related especially to the use of nicotine and inhaling. 


Received October 6, 1947. 





Factors in the Design of Clock Dials Which Affect Speed and 
Accuracy of Reading in the 2400-Hour Time System 


Walter F. Grether 
Aero Medical Laboratory, Wright Field, Ohio 


People commonly experience difficulty in using the 2400-hour time 
system which has become standard in military practice. When time is 
read from a 12-hour dial, it is necessary to add 1200 to all readings after 
12:00 A.M. The mental arithmetic thus required introduces an oppor- 
tunity for error, and also some delay in obtaining the desired figure. On 
the other hand, 24-hour dials designed to give readings directly in 2400- 
hour time are, at first glance, quite confusing to persons who have spent 
their entire life reading time from 12-hour clocks. In the 24-hour dial, 
only one of the hourly positions can appear in its conventional location. 
In addition, an interval of one hour on the hour scale corresponds to 2½ 
instead of 5 minutes on the minute scale. One of the major purposes of 
the present experiment was to find out whether the 12-hour or 24-hour 
dial can be read more easily when readings are required in 2400-hour time. 
A further purpose was to evaluate a number of the possible factors in the 
design of either type of dial which might influence speed and accuracy 
with which readings are obtained. 
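The mental arithmetic described above, and the change in scale on a 24-hour dial, can be sketched as follows. The function names are hypothetical, and the treatment of the 12 o'clock hour (noon read as 1200, midnight as 0000) follows the usual convention and is an assumption, since the article does not spell it out.

```python
def to_2400(hour: int, minute: int, meridiem: str) -> str:
    """Convert a 12-hour dial reading plus A.M./P.M. to 2400-hour time."""
    if not (1 <= hour <= 12 and 0 <= minute <= 59):
        raise ValueError("expected hour 1-12 and minute 0-59")
    is_pm = meridiem.strip().upper().replace(".", "") == "PM"
    # Assumption: 12 noon -> 12, 12 midnight -> 00.
    hour24 = hour % 12 + (12 if is_pm else 0)
    return f"{hour24:02d}{minute:02d}"


def minutes_per_hour_interval(hours_on_dial: int) -> float:
    """Minute-scale units spanned by one hour on the hour scale."""
    return 60 / hours_on_dial


print(to_2400(3, 25, "P.M."))          # 1525
print(to_2400(9, 5, "A.M."))           # 0905
print(minutes_per_hour_interval(12))   # 5.0
print(minutes_per_hour_interval(24))   # 2.5
```

The second function shows why the 24-hour dial feels unfamiliar: the hour numerals are packed twice as closely against the minute scale as on a conventional dial.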

The clock dial designs used in the experiment were selected in order 
to make possible a comparison of the following variables: 


1. A 12-hour vs. a 24-hour dial. 
2. Use of numerals vs. no numerals on the minute scale. 
3. Use of 1-minute vs. 5-minute graduations on the minute scale. 
4. Use of numerals at all hourly positions vs. replacement of some 
numerals with mere reference marks. 
5. Addition of a 13- to 24-hour scale to the 12-hour dial vs. no such 
scale. 
6. Placement of the 24-hour position at the top vs. the bottom of a 
24-hour dial. 
7. Placement of the 60-minute position at the top vs. the bottom of 
a 24-hour dial. 


Experimental Procedure 


For the purposes of this experiment, 11 different designs of clock dials 
were prepared. A sample of each of these designs, considerably reduced 
in size, is presented in Figure 1. Five of these clocks, types A through E, 
are variations of the 12-hour clock. The remaining six are variations of 
the 24-hour clock. Mock-ups of the 11 dials were prepared with movable 
hands. These were then photographed with the hands in different posi- 
tions to make up the actual items of a printed test. This test was made 
up in two parts. In Part I there were 10 reproductions of each clock 
face. The different dial designs were intermingled in a predetermined 
irregular sequence so that the subject was required to change from one 
dial to another as he worked on the successive items of the test. A time 
limit was used for the entire 110 intermingled items, and thus no speed 
data could be obtained for any individual dial design. Part II of the 
test was prepared with 10 reproductions of each dial presented succes- 
sively, thus making possible the use of a time limit for each of the 11 de- 
signs presented. In Part II, therefore, both accuracy and speed data 
could be obtained. 

In preparing the clock reading test precautions were taken to equalize 
all factors which might contribute to reading difficulty except for those 
factors which were being studied. All dials were 2.25 inches in outer 
diameter. All numerals and graduation marks of comparable meaning 
were of the same dimensions on all dials and were sufficiently large to 
avoid any problems of visibility. Most of the dials to be compared 
directly with one another differed in only one characteristic, so that any 
difference in speed and accuracy of reading could be attributed to this 
one difference. As shown in Figure 1, an A.M. or P.M. beside each clock 
indicated to the subject whether the time shown was before or after 
12:00 noon. 

Additional precautions were taken to equalize the inherent difficulties 
of the time settings on the different clock designs. It was assumed, for 
example, that A.M. readings would be less difficult than P.M. readings, 
and that readings would be easier when the minute hand is on a five 
minute graduation mark than when it is on an intermediate position. 
For this reason, all clock dial designs were equalized with respect to the 
number of A.M. and P.M. readings, number of readings at 5-minute posi- 
tions, average magnitude of minute readings, and number of hour readings 
at major positions (i.e., 3, 9, 12, 15, etc.) In determining the sequence 
in which the test items appeared in Part I of the test, precautions were 
taken to insure that there was no grouping of items involving a particular 
clock near the beginning or end of the test. The actual items in Part II 
of the test used different time settings from those in Part I. Subjects 
were instructed to read the clocks to an accuracy of one minute, and there 
were no settings of the minute hand between one-minute graduations. 

This test was administered to 62 rated military personnel (pilots, 
navigators, and bombardiers) at Wright Field and to 100 advanced 
mathematics students in a Dayton high school. All subjects took Part I 
of the test prior to Part II. In taking Part II of the test, however, 
approximately one-half of the subjects began at the front of the test booklet. 
The remaining one-half of each group took Part II of the test in reverse 
order. That is, they first completed the 10 items for the last dial design 
in the booklet in the order in which they appeared, then those for the 
second from the last dial design in the booklet, etc. For the rated 
military personnel, a time limit of 15 minutes was used on Part I of the test 
and a time limit of 45 seconds on each section of Part II of the test. For 
the high school students, a 19-minute time limit was used for Part I and 
a 1-minute time limit for each section of Part II. 


Table 1 
Per cent Errors and Time per Reading for Eleven Experimental Clock Dials * 

          Rated Military Personnel (N = 62)     High School Students (N = 100) 
          Part I     Part II                    Part I     Part II 
Clock     Per cent   Per cent   Sec. per        Per cent   Per cent   Sec. per 
type      errors     errors     reading         errors     errors     reading 

A           7.2        6.4       5.28            12.2       11.8       7.52 
B           5.6        7.1       5.39             8.6       13.8       7.88 
C          19.0       19.1       5.61            27.4       23.5       8.55 
D           8.7       13.3       5.69            13.0       20.9       8.24 
E           7.3       14.5       5.34            13.0       23.9       8.10 
F           7.4        8.0       4.93            13.3       15.6       6.90 
G           4.2        6.8       4.79             6.1        8.4       6.56 
H          10.8       17.3       5.40            15.3       22.8       7.95 
I          12.8       17.7       5.45            19.6       29.5       7.79 
J           7.7        3.6       5.02            14.7        4.9       6.86 
K          42.8       14.7       5.64            35.9       14.2       7.82 





* Significance of differences. 
The results can be considered significant (1 per cent level of 
confidence) if the difference between clocks is equal to or 
greater than the following: 

When the average error score for 
two clocks being compared is:         5%      10%     20% 
  For rated personnel                 3.5%    4.7%    6.2% 
  For high school students            3.3%    4.0%    5.4% 

The results for time per clock reading can be considered 
significant (1 per cent level of confidence) if the differences 
between clocks are equal to or greater than the following: 

  For rated personnel                 .20 sec. 
  For high school students            .31 sec. 








Speed and Accuracy of Reading Clock Dials 


Results 


A summary of the major results of this experiment is presented in 
Table 1, which shows the per cent errors (of one or more minutes) on each 
clock, for both parts of the test, and for both groups of subjects. The 
table also shows the time per clock reading in seconds for Part II of the 
test, for both groups of subjects. At the bottom of the table are shown 
the estimated differences required for significance at the 1 per cent level. 
Differences in the results for any two clocks which are equal to or greater 
than the differences at the bottom of the table may be assumed to be 
genuine differences and not the result of chance factors. 

Fig. 2. Per cent errors in military time readings on eleven experimental 
clock dials (Part I). 
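To illustrate how the tabled thresholds are meant to be applied, the sketch below compares the time per reading of two dials against the .20 sec. difference given for rated personnel. The function name `speed_difference` is hypothetical; the times are those listed for rated military personnel in Table 1.

```python
# Seconds per reading, Part II, rated military personnel (Table 1).
SEC_PER_READING = {
    "A": 5.28, "B": 5.39, "C": 5.61, "D": 5.69, "E": 5.34, "F": 4.93,
    "G": 4.79, "H": 5.40, "I": 5.45, "J": 5.02, "K": 5.64,
}
THRESHOLD_SEC = 0.20  # 1 per cent level for rated personnel (table footnote)


def speed_difference(dial_a: str, dial_b: str):
    """Return (difference in sec., whether it reaches the 1 per cent threshold)."""
    diff = round(abs(SEC_PER_READING[dial_a] - SEC_PER_READING[dial_b]), 2)
    return diff, diff >= THRESHOLD_SEC


print(speed_difference("A", "G"))  # (0.49, True)
print(speed_difference("F", "G"))  # (0.14, False)
```

On this criterion the best 24-hour dial (G) is reliably faster than the conventional 12-hour dial (A), while the difference between dials F and G falls short of the threshold.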

The results presented in Table 1 are also presented in the form of bar 
diagrams in Figures 2, 3 and 4. It will be noted in Table 1 and Figures 
2, 3 and 4 that the data for high school students and rated personnel 
present substantially the same overall picture. In general, also, the 
differences which appeared among the dials in Part I of the test reappear 
in Part II. Thus, although many of the differences between dials on one 
part of the test and for one group of subjects are not significant, the fact 
that the differences are in the same direction in Part II and for the other 
group of subjects greatly increases the likelihood that the differences are 
significant. 

In general, accuracy of readings was somewhat lower in Part II of the 
test, even though successive items were similar, probably because the 
timing of individual sections motivated the subjects to work at greater 
speed. 

Fig. 3. Per cent errors in military time readings on eleven experimental 
clock dials (Part II). 

In Table 2, an analysis is presented of the various types of error made 
on the different clock dials in Part II of the test. Most of the errors 
were 1 minute, 5 minutes, 1 hour or 12 hours in magnitude. The fre- 
quency of each of these types of error is shown for each dial for both the 
military personnel and the high school students. 

Table 3 presents a number of product-moment correlation coefficients 
which aid in an evaluation of the experimental method used in this 
investigation. The correlations between speed and accuracy of individuals 
are low but positive for both rated military personnel and high school 
students, although the correlation for military personnel is not significant. 
This result indicates that the more rapid individuals tend also to be more 
accurate. The correlations between speed and accuracy for the different 
dials are high and positive for both groups of subjects, indicating that 
the dials which can be read with the greatest speed can also be read with 
the greatest accuracy. The correlations between errors on Parts I and 
II of the test are positive but not significant for both groups of subjects. 
This would seem to indicate that the results depend to a considerable 
extent upon whether the individual dial designs are intermingled as in 
Part I or grouped as in Part II. The final correlations, between the 
results for the two groups of subjects, are positive and very high, indicating 
that the relationships among the results for the different dials were quite 
independent of the experience of the subjects with military time. 

Fig. 4. Speed of military time readings on eleven experimental 
clock dials (Part II). 
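The dial-level correlations described above can be illustrated from the per cent errors listed in Table 1. The sketch below is only a worked example of the product-moment formula applied to those eleven pairs of values; the function name `pearson_r` is hypothetical, and the recomputed coefficients need not agree exactly with those the author obtained from the original data.

```python
import math

# Per cent errors for dials A through K, taken from Table 1.
rated_part1 = [7.2, 5.6, 19.0, 8.7, 7.3, 7.4, 4.2, 10.8, 12.8, 7.7, 42.8]
rated_part2 = [6.4, 7.1, 19.1, 13.3, 14.5, 8.0, 6.8, 17.3, 17.7, 3.6, 14.7]
school_part1 = [12.2, 8.6, 27.4, 13.0, 13.0, 13.3, 6.1, 15.3, 19.6, 14.7, 35.9]


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Two subject groups on Part I, and Part I vs. Part II for rated personnel.
r_between_groups = pearson_r(rated_part1, school_part1)
r_between_parts = pearson_r(rated_part1, rated_part2)
print(round(r_between_groups, 2), round(r_between_parts, 2))
```

With only eleven dials a coefficient must be fairly large before it can be called significant, which is consistent with the statement that the correlation between the two parts of the test is positive but falls short of significance.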
Table 2 


Frequency of Several Types of Error in Reading of Eleven Experimental 
Clock Dials (Part II of Test) 








Rated Military Personnel          High School Students 
N = 62                            N = 100 


Table 3 
Correlations Among Several Variables in Experiment on Design of Clock Dials 

Variables                                                    Correlation   Significance 
                                                                  r           Level 

Relation between speed and accuracy of individuals 
  Per cent errors vs. number of items omitted (Part II) 
    N = 62 Rated military personnel                                          Not sig. 
    N = 100 High school students                                             1% 
Relation between speed and accuracy for different dials 
  Per cent errors vs. seconds per reading (Part II) 
  N = 11 clock dial designs 
    For rated military personnel 
    For high school students 
Relation between results on two parts of test 
  Per cent errors on Part I vs. per cent errors on Part II 
  N = 11 clock dial designs 
    For rated military personnel 
    For high school students 
Relation between results for two groups of subjects 
  Rated military personnel vs. high school students 
  N = 11 clock dial designs 
    Per cent errors on Part I 
    Per cent errors on Part II 
    Seconds per reading on Part II 


Evaluation of Results with Regard to Experimental Method 


1. Lack of control of either the time or speed variable. In this experiment 
speed and accuracy were allowed to vary independently. The positive 
correlations between speed and accuracy of individuals (Table 3) show 
that the individuals who achieved the greatest accuracy did not do so at 
the expense of increased time. More important for this investigation, 
however, are the relations between speed and accuracy of the 11 different 
dials. The high correlations indicate that those dials which can be read 
most accurately are also read most quickly, and thus the same conclusions 
are reached whether speed or accuracy is taken as the criterion. Had 
time per test item been equalized for all dials, it is probable that the error 
differences among the several dials would have been accentuated but not 
changed in direction. It is believed that the procedure of allowing speed 
and accuracy to vary independently has value in that it provides two 
criteria for evaluating the design variables under investigation. 

2. Intermingling versus grouping of different dial designs. In Part I 
of the clock reading tests the various dial designs were intermingled in a 
random fashion so that the subject was unable to adjust to a particular 
design. In Part II each design was presented in a separately timed 
section of the test booklet. The correlations between errors in Parts I 
and II (Table 3) for the different dial designs are positive but below the 
5% level of confidence. It is concluded from this that the difference in 
method did significantly affect the results. The greatest effect appeared 
in dial design K which had the minute scale rotated 180 degrees from its 
conventional location. In Part II of the test the subjects were able to 
adjust to this arrangement and avoid the high percentage of errors made 
in Part I. The test method in Part II probably simulates more closely 
the clock reading situation in real life, particularly for the aircraft pilot 
or navigator, since he repeatedly reads time from the same instrument. 
There is an additional argument in favor of Part II, namely, that it 
provided speed as well as accuracy data. In Part I of the test the time 
per item could be neither controlled nor measured. It is concluded that 
this difference in method does affect the results, and that the method used 
in Part II is to be preferred for practical reasons. 

3. Previous experience of subject group. In this experiment two quite 
different types of subjects were used: rated military personnel with con- 
siderable training and experience in the use of 2400-hour time, and high 
school students with little if any exposure to this time system. Although 
the high school students made more errors and required more time per 
test item, the last group of correlation coefficients in Table 3 indicates 
that the basic findings were virtually the same for either group of subjects. 
We can conclude, therefore, that in this experiment the nature of the 
previous experience of the subjects was an unimportant variable. 


Evaluation of Results with Regard to Clock Dial Design 


1. Twelve-hour vs. 24-hour dial. Comparison of the first 5 with the 
last 6 clocks in Table 1 shows that there was no major advantage in favor 
of either the 12- or 24-hour dial, although 24-hour dials, Types G and J, 
were superior to the two best 12-hour dials, Types A and B. This was 
particularly true for speed of reading. The 24-hour clocks showed 
somewhat more 1-hour errors, probably because of the smaller spacing of the 
hour numerals. 

2. Numerals vs. no numerals on minute scale. The comparison of 
clocks A and B does not reveal any significant advantage to placing nu- 
merals on the minute scale of a 12-hour dial. In the case of the 24-hour 
dial, however, as indicated by comparison of clocks F and G, there ap- 
peared to be a definite advantage in favor of numerals on the minute 
scale. Dials without numerals on the minute scale showed a considerably 
higher proportion of 5-minute errors in Table 2. 

3. One-minute vs. 5-minute graduations on the minute scale. 
Comparison of clocks A and C, and F and I indicates a significant difference 
in favor of placing graduations at 1-minute intervals when readings are 
required to an accuracy of one minute. Clocks C and I showed a high 
proportion of 1-minute errors. 

4. Numerals at all hourly positions vs. replacement of some numerals 
with mere reference marks. Comparison of clocks A and D, and clocks F 
and H indicates a loss in accuracy when numbers were omitted at some 
of the hourly divisions. 

5. Addition of a 13- to 24-hour scale on a 12-hour dial. Clock E, with 
the 13- to 24-hour scale added was inferior to clock A without such a scale. 

6. Placement of the 24-hour position at the top vs. the bottom of a 24-hour 
dial. Clock G, with the 24-hour position at the top, was best in Part I of 
the test, whereas J, with this position at the bottom, was best in Part II of 
the test. This would suggest that in a situation where an individual 
can become accustomed to reading a particular clock, as in Part II of 
the test, there is some advantage to placing the 24-hour position at the 
bottom of the dial. 

7. Placement of the 60-minute position at the top vs. the bottom of the 
24-hour dial. The results for clock K show quite clearly that the uncon- 
ventional location of the 60-minute position at the bottom of the dial 
caused a high percentage of errors. 













Summary 


This experiment was carried out to study design factors which influ- 
ence the speed and accuracy of reading clocks in the military or 2400-hour 
time system. Eleven different designs of clock dials were used, of which 
five were variations of the 12-hour dial and six were variations of the 24- 
hour dial. Reproductions of these dials were presented in a printed test 
divided into two parts arranged so as to make possible a determination 
of both speed and accuracy of clock reading. This test was given to 62 
rated Army Air Forces officers and to 100 high school students. 

The following conclusions are drawn concerning clock dial design 
factors which influence speed and accuracy of readings in 2400-hour time: 


1. A 24-hour dial is slightly superior to a 12-hour dial. 

2. Dials with numerals on the minute scale are superior to dials with- 
out such numerals, particularly on 24-hour clocks. 

3. Lack of one-minute graduation marks reduces speed and accuracy 
(when readings to an accuracy of one minute are required). 

4. Lack of numerals at all hourly positions reduces speed and ac- 
curacy, particularly on the 24-hour type dial. 

5. A 12-hour dial with a 13- to 24-hour scale added is not superior to 
a dial without this additional scale. 

6. Placement of the 24-hour position at the bottom of a 24-hour dial 
appears to be superior to placement at the top. 

7. Placement of the 60-minute position of a 24-hour dial in its con- 
ventional location at the top is superior to locating it at the bottom. 


Received August 11, 1947. 








The Effect of Instrument Dial Shape on Legibility * 


Robert B. Sleight ** 
Division of Education and Applied Psychology, Purdue University 


In spite of a wide diversity of instrument dial types in use today, objec- 
tive evidence is lacking which designates one type of dial as more desirable 
than another from the standpoint of legibility. In this study, compari- 
sons were made of five dials of different shapes, all in common use for 
certain purposes. It was the aim of the study to determine the relative
legibility of these several differently shaped dials. 


Historical Background 


Most instrument dials in use today are of the conventional round 
type with moving pointer; but many people interested in the problem of 
instrument dial design have seen the possible desirability of other types. 
Greatest interest in instrument dial design, especially from the stand- 
point of legibility, has been expressed by those concerned with aircraft 
instruments, in the reading of which speed and accuracy are often of 
vital importance. Beal (1, p. 440) comments on this point as follows: 
“The fundamental fact is that the pilot must be able to read all flight 
instruments quickly and accurately.” 

Other references to dial shape have been made in the literature; for
instance, Stewart (21, p. xvii) mentions the use of edgewise dials in place
of circular dials in aircraft, principally as a means of saving valuable space. 
Eaton (5, p. 9) is of the opinion that further development of the vertical 
or straight dial for use in aircraft might result in considerable convenience 
to the pilot. Hibbard (9, p. 759) suggests, concerning the type of dial 
needed for certain purposes, that: “Research should be done in the design
of an altimeter having on its face an open window in which altitudes can 
be presented directly as figures.” 


* This research was carried out under subcontract between the Purdue Research 
Foundation and The Johns Hopkins University. The subcontract is part of contract 
N5-ori-166, Task Order I, between the Special Devices Center, Office of Naval Research, 
and The Johns Hopkins University. This paper is Report No. 166-1-33 under that
contract. 

** The author wishes to thank Dr. J. A. Bromer, Director of the Instrument Dial 
Design Research at Purdue University, Prof. E. J. Asher, Dr. S. E. Wirt, Mr. E. E.
Dudek, and Mr. J. G. Gleason, also of Purdue University, who provided much advice 
and assistance in the performance of this study. 
One editorial (30) expresses a belief that it is “more convenient to the
user of a dial for calibration to be equally spaced. . . . This may be 
accomplished by using an arc of wider radius than the circle which could 
be accommodated in the same space; or by mechanical translation of an 
arc reading into a straight band reading.” 

Riggs (20) developed a counter-type indicator to replace the microm- 
eter scale on certain optical devices. This counter, in later tests, 
proved to be markedly superior to the standard scales in terms of reading 
errors made by the operators. Slightly more setting errors were made 
with the counter than with the conventional scale. 

Chapanis (4) found that when numerical information has to be read 
from a piece of equipment, a counter-type indicator is a more efficient 
method of presenting information to an operator than an annular dial. 
If the visual indicator must be used for setting information into the equip- 
ment, however, a counter is not as efficient as a dial. 

Additional confirmatory evidence of the same sort was obtained in 
studies by the Applied Psychology Panel, NDRC, which showed a superi- 
ority of open window or counter-type indicators over micrometer scales 
when operators are again simply required to read the scales. 

The Great Britain Air Ministry (29), in 1941, recommended concern- 
ing dial design that “in future design there would be visual advantages 
in designing dials for night use to disclose only that part which needs to 
be read.” They suggest for this, ‘‘a moving disc behind an aperture 
with a fixed pointer.” 

It is not difficult to understand why few changes have been made 
from the common round type of dial. Besides the natural reluctance to 
vary from the type of dial which habit has established, there is the matter 
of engineering convenience. This latter factor, to some extent at least, 
has been responsible for the prevalence of the round dial, because of the 
relative ease of obtaining circular movement of a center pivoted pointer. 
Lester (11, p. 80) points out another feature of the round (clock type) 
dial which makes it convenient from a design stand-point when he says, 
“Designers can wrap ten inches of scale around a three inch dial . . . .” 

It is evident that many factors may limit the usefulness of a certain 
shape dial; for instance, McFarland (17, p. 429) observes that “the length 
of scales may often preclude the use of linear dials and necessitate the
use of circular ones. . . .”

Legibility is naturally only one of the factors which must be con- 
sidered in determining the applicability of a specific type of dial for a 
certain purpose. Illustrating this point is a recommendation made by
McFarland (17, p. 429) that “in instruments, as in controls, it would be 
desirable wherever possible to have the action of the indicator correspond 
to the effect that is being produced on the plane or the unit.” For
example, the use of a vertical dial on altimeters, the pointer moving
upward as the plane ascends, might be an advantage over a dial which lacks
this symbolic feature.

The few experimental studies which have been made of dial legibility 
are of a recent nature. During World War II, Loucks (13, 14, 15, 16) 
directed research in which several features of the currently used aircraft 
instruments were studied from the standpoint of legibility. The major 
conclusions derived from these extensive comparisons of dials were as 
follows: 1. The accuracy with which comparable dials were read de- 
creased as the number of scale divisions increased. 2. The numbering of 
subdivisions tended to decrease the accuracy with which an instrument 
could be read during an exposure of 0.75 second. 3. A reduction in the 
width of a pointer that had partly obscured the smaller numbers and 
scale divisions did not improve accuracy. 4. An increase in the height 
and thickness of letters did not necessarily improve legibility. 5. The 
starting point of a scale had no significant influence on its legibility. 6. 
Mid-division lines that change in value from one part of the scale to 
another proved confusing and gave rise to increased errors. 7. Luminous 
tipped pointers were decidedly inferior to standard hands, but a narrow 
luminous strip along the length of the pointer was satisfactory. On the 
whole, the more simply a dial is designed within the limits of the desired 
accuracy, the less difficulty there will be in reading it. It was found by 
Loucks that even with modification the majority of the dials available 
for his studies had a low degree of legibility in terms of the percentage 
of error during exposures of 0.75 and 1.5 seconds. 

In a study conducted by Vernon (22) on dial and scale reading it was 
reported that: “If an individual is required to read a number of dials 
rapidly, his speed is unlikely to be as great or as regular when the dials are 
differently graduated as when they have the same scale graduations.”

A study planned and directed by Grether (8), on speed and accuracy 
of dial reading in relation to the dial diameter and spacing of scale divi- 
sions, was recently reported. The main conclusions of the study were 
as follows: 1. “The accuracy with which the position of a pointer can be 
read in terms of degrees on a circular scale increases, within limits, as a 
function of dial diameter and frequency (or proximity) of scale divisions.
. . .” 2. “Speed of dial reading is not systematically related to either
dial diameter or angular spacing of scale divisions.”

The influence of convention and habit, as has been intimated before, 
undoubtedly is a compelling force in creating a preference for a particular 
type of dial. This can, however, perhaps be eliminated by proper vali- 
dation of experimental findings and training in the use of dials which are 
desirable from the standpoint of optimum design characteristics. 
Most commercial companies have been guided in their instrument
dial design by standards developed through conference methods and 
customers’ wishes and seldom by experimentally determined criteria. 
(Twenty-two instrument and dial manufacturing companies were con- 
tacted by the author concerning instrument dial design practice.) The 
military as well as civilian enterprises have come to realize that with the 
advent of faster, more complicated machines, there has been emphasized 
the problem of precise control. Advancement of instrumentation is the 
means of accomplishing this needed control. This instrumentation has 
often developed without due regard for the capabilities of the human who 
is to employ it (26). That this tendency is being replaced by a concern 
for the human factor is illustrated in the following quotation: “New
instruments will have new faces, easier to read and interpret. . . .” (28)

To illustrate the variety of uses to which instruments may be put, the 
following classification of instruments according to function, as given by 
Behar (2), is included here: 


1. Balancing          7. Detecting        13. Registering
2. Checking           8. Indicating       14. Sampling
3. Controlling        9. Integrating      15. Signaling
4. Counting          10. Measuring        16. Testing
5. Curve-drawing     11. Metering         17. Timing
6. Cycling           12. Recording        18. Totalizing


It can be seen that instruments and, hence, instrument dials, serve 
varied purposes. It is important to note at the outset that this study 
investigates only one feature of the over-all legibility aspect of an in- 
strument dial. Besides the dial shape, there are many dial character- 
istics which are of importance in determining dial legibility. These in- 
clude such dial features as size and style of numerals; size and style of 
graduation marks, number and spacing of marks; shape, size, and direc- 
tion of movement of a pointer; color and contrast of areas of the dial. 
Although some of these features have been examined to a limited degree, 
further study is needed before optimum dial specifications can be objec- 
tively stated. 


Definition of Terms 


Legibility as the term is used in this report carries the connotation of 
recognizability and in addition, meaningfulness. Paterson and Tinker (18) 
use legibility in referring to meaningful reading of printed matter. Most 
studies of legibility in the past have been concerned with various aspects of 
printed and written material, in connection with the meaningful reading of 
words, letters, etc. 

Legibility of instrument dials, then, is essentially the degree to which it is
possible to gain meaningful information from given indications. 
Legibility should be distinguished from two other closely related concepts,
namely, acuity and interpretability. Acuity is most often defined as the 
ability to distinguish fine detail. This perception is of a visual threshold 
nature and is not necessarily meaningful. Interpretability is a more complex 
concept in that it indicates a condition of preparedness for a complex response
as well as recognition and meaningfulness. 


Criterion 


The criterion of dial merit chosen for the present study was that of 
legibility, as measured by the comparative accuracy of readings made 
from various types of dials. Other criteria might be well worth considera- 
tion, such as measures of work decrement in activities involving dial 
reading. No other criterion than that of legibility, however, has been 
included in this study. That this criterion is probably a desirable one is 
indicated by Kelly’s comment (10) that: “Criterion specification for de- 
sign of an improved instrument panel is optimum legibility.” 


Experimental Procedure 
Choice of Experimental Method 


Some of the experimental techniques used in the studies on legibility 
of printed matter are applicable to experimentation in dial design. Five 
common methods of measuring legibility have been summarized by Burtt
and Basch (3) as follows (these are given as referred to in studies of
legibility of type faces): “(1) maximum distance at which type may be read;
(2) time taken to read a passage; (3) number of letters read in a tachis- 
toscopic presentation, or minimum exposure at which they can be read; 
(4) minimum illumination under which they can be seen; and (5) extent 
to which letters can be thrown out of focus and still be identified.” 

The method used in this study was essentially a form of the third 
method noted above, but in this case, dial reading accuracy in a tachis- 
toscopic presentation. 


Subjects 


The subjects used in this study comprised a group of 60 male uni- 
versity students, principally elementary psychology students. In addi- 
tion, five subjects were used in a preliminary experiment and their re- 
sponses are considered in the discussion. 

The only selection factor operating in the choice of these subjects 
was a brief screening test designed to determine that the subjects had 
“normal” visual acuity (corrected or uncorrected) for the testing
distance used in the experiment. The acuity target used was the Snellen
Rating Reading Card.
Apparatus


Tachistoscope. The apparatus used in this study was a form of mirror 
tachistoscope of original design. This tachistoscope operates on the principle 
that glass is transparent when an illuminated object lies behind it, while the 
same glass functions as a mirror when an illuminated object lies in front. 

An explanation of the tachistoscope will be facilitated by reference to the 
schematic plan shown in Figure 1 and the cutaway view in Figure 2. (Letter 
notations that follow refer to those shown in Figure 1.) 
Fig. 1. Schematic plan of the tachistoscope.


The exposure apparatus consisted of a large black interior-painted box 
which had a partially reflecting mirror M mounted inside at an angle of 45 
degrees to the observer’s line of sight. At the front of the box was an opening 
through which the observer could view the stimulus material S by looking 
through the transparent mirror. The stimulus material was inserted in an 
opening at the rear of the box. The center of the front and rear openings 
were at eye level for the seated observer. 

The pre-exposure area was obtained by means of light from tubular lamps 
in the top chamber. This light illuminated a sheet of opal glass G1. The
observer from his position at O could see the image of this lighted area reflected 
from the surface of the mirror M. This lighted area served to maintain at a 
constant level the observer’s light adaptation. 

Light from tubular lamps in the lower compartment passed through a 
sheet of opal glass G2 and was reflected from the bottom surface of the mirror
on to the stimulus material. Baffles of thin sheet metal set over the opal glass
G2 served to prevent direct illumination of the stimulus material which would
have produced uneven brightness. 

When the bottom lights were “‘on” and the top lights “off” it was possible 
for the observer to look through the transparent mirror and view the stimulus 
material S; when these lights were reversed the reflection of the opal glass G1
was visible and the stimulus material was obscured. 
In this apparatus the pre-exposure field G1 and the exposure field S were
constructed in such a manner that the areas as viewed by the observer were 
equal. Also the intensity of illumination of the pre-exposure lights and the 
exposure lights was adjusted by means of perforated shields so that these fields 
had even and equal brightness. 

The focal distance for the observer’s eye was maintained constant by having 
the distance from the observer’s eye to the stimulus material equal to the 
distance from the observer’s eye to the top surface of the mirror horizontally, 
plus the distance vertically from the mirror to the lighted pre-exposure area G1.











Fig. 2. Cut-away view of the tachistoscope.


The observer was seated in an adjustable chair with his eyes level with the 
middle of the front opening. The experimenter was stationed at the rear of 
the box where he could adjust the dial pointers to desired settings. 

The brightness of the pre-exposure and exposure fields for this experiment
was equated at a level of 4.10 foot-lamberts as measured using a
Macbeth Illuminometer. The contrast between the dial background and the 
dial markings was high. 

Timing Mechanism. The timing mechanism used to regulate the exposure
time for the stimulus material was of the electronic type. This timer made 
possible control through a range of times from one-sixtieth to one second with 
a continuously adjustable control. The exposure times used in the preliminary 
experiment were as follows: 0.28, 0.20, 0.17, 0.14, and 0.12 second. For the 
main experiment a time of 0.12 seconds was chosen because, due to the sim- 
plicity of the dial reading problem, a brief exposure was necessary to provide 
sufficient errors to differentiate among the dials. 

The timer was activated by a push button at the discretion of the experi- 
menter. The electronic timer itself served to control a direct current relay 
which acted to “make” and “break” the current flowing to the pre-exposure
and exposure lamps. 
Stimulus Materials


The stimulus materials consisted of five dial types, as shown in Figures 
3 to 7, inclusive. (Representative settings are shown in these Figures 
as the subject, participating in the experiment, viewed the instrument 
dials. The subject, however, viewed them singly.) The features of these
dials other than the dial form or shape were intentionally held as constant 
as possible as an aid in decreasing the number of dependent variables. 
Figs. 3-7. Photographs of dials used in the experiment
(approximately % actual size). 


Among the features of the dials which were held constant were the 
following: numeral dimensions and form, size of graduations, distance 
between graduations, position of the numerals and dimensions of the 
pointers. 
The dials were made up on matte finished white drawing paper using black
India ink, the No. 4 pen, and the 3506 template of the standard LeRoy Letter- 
ing Outfit. The numerals used closely approximated the Army-Navy numeral 
specifications for instrument dials. After these drawings were made they were 
glued securely on the surface of twelve inch square plywood panels and pointers 
were attached in a manner which permitted adjustment by the experimenter 
through reference to an indicator on the rear of the panel. The diameter of
the round dial used was about two and one-half inches. The circumference 
of this dial was about eight inches. The dimensions of the other dial types 
were derived from those of this round dial. 
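
The two figures quoted for the round dial can be checked against each other. The short sketch below assumes only that the scale runs around the full circle; the function name is illustrative and does not come from the report.

```python
import math

def scale_length(diameter_inches):
    """Circumference of a full-circle scale of the given diameter, in inches."""
    return math.pi * diameter_inches

# A dial of two and one-half inches diameter, as stated in the text.
print(round(scale_length(2.5), 2))
```

The result is a little under eight inches, which agrees with the "about eight inches" reported for the circumference.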


Dial Settings. The pointer settings actually used in the experiment 
were determined by using the settings possible on the numerals or mid- 
way between two numerals, i.e., on the major graduations or on the minor 
graduations. 
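
The rule for choosing settings can be illustrated with a brief sketch. It assumes the seventeen settings that appear on the abscissa of Fig. 9 (1 through 9 in half-unit steps) and the sixty subjects of the main experiment; the function and variable names are illustrative only.

```python
# Assumed from the abscissa of Fig. 9: settings run from 1 to 9 in half-unit steps.
FIRST, LAST, STEP = 1.0, 9.0, 0.5

def pointer_settings(first=FIRST, last=LAST, step=STEP):
    """List every whole-number (major) and midway (minor) setting in the range."""
    count = int(round((last - first) / step)) + 1
    return [first + i * step for i in range(count)]

settings = pointer_settings()
major = [s for s in settings if s == int(s)]   # pointer directly on a numeral
minor = [s for s in settings if s != int(s)]   # pointer midway between two numerals

N_SUBJECTS = 60  # size of the main-experiment group given in the text
readings_per_dial = len(settings) * N_SUBJECTS

print(len(major), len(minor), readings_per_dial)
```

Under these assumptions the product of settings and subjects equals the 1020 readings per dial on which the percentages in Table 3 are based.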

Dials Chosen for Study. A brief survey of dials in common usage was 
made by checking the available literature. Most thoroughly checked 
in this survey were the dials illustrated in several catalogues of instru- 
ment and dial manufacturing companies. The more common different 
types noted were those used in this study. 


Experimental Design 


The dials were presented to the subjects in a systematically rotated 
fashion; e.g., the first subject read the horizontal dial first, then the ver- 
tical, next the round, etc.; the second subject read the vertical dial first, 
then the round, next the open-window, etc. By thus rotating the order 
of presentation of the dials with succeeding subjects, there was accom- 
plished an effective and yet simple method of dealing with the problem of 
practice and fatigue effect. 
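
A minimal sketch of such a rotation follows. The base order of the first four dials is taken from the two examples above; placing the semi-circular dial fifth is an assumption, since the text breaks off with "etc." The function name is likewise illustrative.

```python
from collections import deque

# Base order for the first subject; the position of "semi-circular" is assumed.
DIALS = ["horizontal", "vertical", "round", "open-window", "semi-circular"]

def presentation_order(subject_index, dials=DIALS):
    """Return the dial order for a subject (0-based), shifted one step per subject."""
    order = deque(dials)
    order.rotate(-(subject_index % len(dials)))  # negative rotation shifts to the left
    return list(order)

for subject in range(5):
    print(subject + 1, presentation_order(subject))
```

With sixty subjects and five dials, a rotation of this kind places each dial in each serial position twelve times, so that practice and fatigue are spread evenly over the dial types.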


Administration Procedure 


The procedure for the administration of the experiment as carried out by 
the experimenter was as follows: 


1. S was seated in an adjustable chair so that the center of the viewing 
aperture was at eye level. 

2. E placed the acuity card (Snellen Rating Reading) in the panel holder 
and switched the exposure light to the steady “on” position. E said to S,
“Please read the letters in the 30 group in reverse order. Now the 25 group.” 
(E made certain that S had “normal” acuity.) 

3. E switched the pre-exposure light on. S was shown the first dial through 
the transparent mirror, illuminated by the rear light, and was told, “This is
the first dial which you will read.” (In each case E pointed out where the 
0 and 9 numerals were located.) 

4. E placed the dial in the holder. 

5. E said to S, “The lights in the apparatus are so arranged that they will
flash on and off and you will have a brief view of the dial. I will make the 
settings before the light flashes. All settings will be directly on a number or 
midway between two numbers. You will give the reading shown by the 
pointer.” 







6. S was advised that he would be given some practice readings. S was 
given random settings until he answered two successive readings correctly. 

7. Before activating the timer, S was prepared by E saying, “Ready, now!”
Then E pushed the button and activated the timer to expose the dial.

8. E made the settings as listed on the test sheet.

9. E recorded errors made by S opposite the actual setting on the data
sheet.

10. For successive dials the instructions to S were abbreviated. S was 
shown the next dial and told, “This is the next dial.” The subsequent
procedure was repeated as for the first dial.

Comments on Experimental Set-up. In connection with the experimental 
set-up, the following points seem worthy of some discussion: 

1. The brightness of the pre-exposure, exposure, and post-exposure fields 
was equal. This permitted the subject to maintain a constant condition of 
adaptation. It is reported that a dark post-exposure field allows the retinal 
after-response to supplement the exposure. A very bright post-exposure field 
washes out the retinal image relatively quickly (25, 27). In general, constant 
light adaptation is preferable in studies of this type because if the pre-exposure 
field is dark, the time between successive exposures would need to be constant 
to provide comparable conditions. 

2. The illuminated area of the pre-exposure, exposure, and post-exposure 
fields was equal. This tended to reduce distraction which might have resulted 
from variations in the size of successive visual stimuli. 

3. The fixation area in the pre-exposure field was at the same optical dis- 
tance as the stimulus object. This enabled the subject’s eyes to be properly 
focused and converged in advance of the brief exposure. 

4. The pre-exposure and post-exposure fields succeeded each other without 
motion visible to the eye. A slow motion would have tended to cause a pursuit 
movement of the eyes and lead them away from the fixation area. 

5. There was an absence of distractive noises and moving parts. 

6. The duration of exposures was of sufficient length to allow a clear view 
of the stimulus material but brief enough to prevent successive views. Whipple 
(25, p. 226) reports that an exposure time of 0.15 second allows only one view. 

7. The ready signal was adequate to prepare the subjects for attending to 
the exposure. Absence of a ready signal might result in shifting of view and 
momentary inattention and consequent poor performance. 

8. The fore-period was of variable duration as recommended for this type 
of experimentation (7, p. 383). A constant fore-period leads to anticipation 
by the subject and too long or short a fore-period does not permit the subject
to maintain a favorable “set.”

9. The timing mechanism gave constant exposure times. 


Preliminary Experiment 


An initial pilot experiment was conducted previous to the main ex- 
periment. Data of this experiment were subjected to an analysis in 
order to evaluate the soundness of the proposed experimental design. 
Many of the experimental variables were controlled in the original ex- 
perimental plan but a knowledge of the influence of the subject and ex- 
posure time variables was desired. 

It was also hoped that this brief study would indicate whether or not 
further experimentation would be worthwhile, i.e., whether sufficient dif- 
ference would be noted in the subjects’ readings on the different dials. 
Several exposure speeds were used, the objective being to determine
an exposure speed such that sufficient errors would be committed, by 
randomly selected subjects, to allow distinction among the dial types on 
this basis. The exposure speeds actually used were as follows: 0.28, 0.20, 
0.17, 0.14, and 0.12 second. 


Results 


Preliminary Experiment 


The data obtained from the preliminary experiment with five male
university students as subjects are shown in Tables 1 and 2.


Table 1
Errors Made by Five Subjects in the Preliminary Experiment

                  Exposure Speeds in Seconds
Subjects    0.28     0.20     0.17     0.14     0.12
   A       *H  3     O  0     S  4     R  2     V  6
   B        S  2     R  2     V  6     H  1     O  0
   C        V 10     H  6     O  1     S  6     R  0
   D        O  0     S  4     R  4     V 12     H  2
   E        R  3     V  6     H  8     O  0     S  7

* Dial types are designated as follows: H, horizontal; O, open-window; R, round;
V, vertical; and S, semi-circular.


Table 2
Analysis of Variance of the Preliminary Experiment Data

                                             Estimate of
                    Sum of     Degrees of    Population
Source              Squares    Freedom       Variance      F Observed *
Exposure Speed        7.6          4            1.9           0.40
Subject              26.0          4            6.5           1.36
Dial Type           169.2          4           42.3           8.87
Residual             57.2         12            4.77            -

* According to Lindquist (12, pp. 62-65), for degrees of freedom equalling 4 and 12,
an F of 5.41 is significant at the 1% level.


In these tables it will be noted that three variables are considered; namely,
exposure speeds, subjects, and dial types. These data, when studied using
a Latin square analysis of variance technique, showed that exposure
speeds and subjects did not account for significant variance, while
variance attributable to dial type was clearly significant at well
beyond the 1% level of confidence (12). This indicated that for
further experimentation any of the exposure speeds used in the prelimi- 
nary experiment would probably yield discriminative data for the dials. 
It was indicated also that the subjects could safely be recruited from the 
student group. 
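
The Latin square analysis can be retraced from the entries of Table 1. The sketch below is an illustration in modern terms and not the computing routine used in the study; its names are hypothetical. The entry for subject C at 0.28 second is taken as 10, the value under which the sums of squares agree with Table 2.

```python
# Table 1: for each subject, the (dial, errors) pairs at 0.28, 0.20, 0.17, 0.14, 0.12 sec.
TABLE_1 = {
    "A": [("H", 3), ("O", 0), ("S", 4), ("R", 2), ("V", 6)],
    "B": [("S", 2), ("R", 2), ("V", 6), ("H", 1), ("O", 0)],
    "C": [("V", 10), ("H", 6), ("O", 1), ("S", 6), ("R", 0)],  # 10 is assumed (see lead-in)
    "D": [("O", 0), ("S", 4), ("R", 4), ("V", 12), ("H", 2)],
    "E": [("R", 3), ("V", 6), ("H", 8), ("O", 0), ("S", 7)],
}

def latin_square_anova(table):
    """Return {source: (sum of squares, df, variance estimate, F)} for a k x k Latin square."""
    k = len(table)
    cells = [(subj, col, dial, x)
             for subj, row in table.items()
             for col, (dial, x) in enumerate(row)]
    grand_total = sum(x for *_, x in cells)
    correction = grand_total ** 2 / (k * k)

    def sum_of_squares(key_index):
        # Total the errors within each level of one factor, then square and pool.
        totals = {}
        for cell in cells:
            totals[cell[key_index]] = totals.get(cell[key_index], 0) + cell[3]
        return sum(t * t for t in totals.values()) / k - correction

    ss_total = sum(x * x for *_, x in cells) - correction
    effects = {
        "Exposure Speed": sum_of_squares(1),
        "Subject": sum_of_squares(0),
        "Dial Type": sum_of_squares(2),
    }
    ss_resid = ss_total - sum(effects.values())
    df_effect, df_resid = k - 1, (k - 1) * (k - 2)
    ms_resid = ss_resid / df_resid

    result = {name: (ss, df_effect, ss / df_effect, (ss / df_effect) / ms_resid)
              for name, ss in effects.items()}
    result["Residual"] = (ss_resid, df_resid, ms_resid, None)
    return result

for source, (ss, df, ms, f) in latin_square_anova(TABLE_1).items():
    f_text = "" if f is None else f"{f:.2f}"
    print(f"{source:15s} {ss:7.1f} {df:3d} {ms:7.2f} {f_text:>6s}")
```

Each sum of squares is obtained by totalling the errors within the levels of one factor, and the residual is what remains of the total after the three factors are removed. This is why the residual carries (k - 1)(k - 2) = 12 degrees of freedom.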


Main Experiment 
Percentage and Statistical Significance of Errors by Dial Types. Table
3 shows the incorrect readings made by sixty subjects on each of the five
dial types.


Table 3
Incorrect Readings Made by Sixty Male University Students on
Each of Five Dial Types

                                        Dial Types
                                                        Open-      Semi-
                      Horizontal  Vertical   Round     window     circular
Incorrect
  Readings *             280        362       111
Percentage
  Incorrect              27.5       35.5      10.9       0.5        16.6
Mean Number of
  Errors per Subject *   4.67                 1.85

* N for the readings = 1020.
  N for the subjects = 60.


In this instance, the total incorrect readings include those
settings on which the subjects were unable to make readings. The
percentage of incorrect readings in Table 3 is based on a total of 1020 settings
on each dial. The mean number of errors per subject is based on a total
of sixty subjects used in the experiment. Figure 8 shows graphically the
extent of the incorrect readings on the various dials.
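
The two derived rows of Table 3 follow directly from the error counts. The sketch below uses the three counts that can be read from the table; assigning them to the horizontal, vertical, and round dials is an inference from the column order and from the percentages quoted in the text, and the names are illustrative.

```python
N_READINGS = 1020  # settings read on each dial (Table 3, footnote)
N_SUBJECTS = 60    # subjects in the main experiment

# Counts legible in Table 3; their assignment to dials is an assumption.
incorrect_readings = {"horizontal": 280, "vertical": 362, "round": 111}

def summarize(count, n_readings=N_READINGS, n_subjects=N_SUBJECTS):
    """Return (percentage incorrect, mean errors per subject) for one dial."""
    return 100 * count / n_readings, count / n_subjects

for dial, count in incorrect_readings.items():
    pct, mean = summarize(count)
    print(f"{dial:10s} {pct:5.1f}% incorrect, {mean:.2f} errors per subject")
```

The percentages so obtained match those reported for these three dials, which supports the reading of the counts adopted here.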

As shown in Table 3 and Figure 8, there was considerable variation 
in reading efficiency for the five dials studied, errors in reading ranging from 
0.5 per cent for the open-window type to 35.5 per cent for the vertical dial. 
The five dials ranked according to the percentage of error as follows: 


1. open-window .... 0.5%
2. round .......... 10.9%
3. semi-circular .. 16.6%
4. horizontal ..... 27.5%
5. vertical ....... 35.5%
That the differences in accuracy of reading on the five dial types are
significant throughout is shown in Table 4. This table shows an estimate
of the significance of the differences in percentage of errors between each
pair of dials. All of the “t” values shown in Table 4 are significant at
well beyond the 1% level of confidence (19, p. 53). (Actually, the smallest
“t” value obtained in these comparisons indicated that the chances
are only one in 5000 that the difference could have occurred by chance
variation alone.) Thus, all differences reported are clearly significant.
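
One common way of testing a difference between two percentages is sketched below. It is not necessarily the procedure of reference (19) used for Table 4: it treats the two sets of 1020 readings as independent samples, although the same subjects read every dial, and it refers the ratio to the normal curve. The function name and the choice of the round and semi-circular dials as the example pair are illustrative.

```python
import math

def proportion_difference(p1, p2, n1, n2):
    """Critical ratio and two-sided normal probability for p2 - p1 (independent samples assumed)."""
    standard_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    ratio = (p2 - p1) / standard_error
    probability = math.erfc(abs(ratio) / math.sqrt(2))  # two-sided tail area
    return ratio, probability

N = 1020  # readings per dial
ratio, probability = proportion_difference(0.109, 0.166, N, N)  # round vs. semi-circular
print(f"critical ratio = {ratio:.2f}, two-sided probability = {probability:.5f}")
```

Even for this pair, which lies close together in the ranking, the ratio computed under these assumptions falls well beyond the value required at the 1% level.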


Fig. 8. Percentage of incorrect readings made on five dial types.
(Bar chart. Ordinate: percentage of incorrect readings. Abscissa: dial types.)


Although with the dials used in this experiment it is difficult to define 
the “actual” area covered by any dial, it appears that there was a definite 
positive relationship between what might be termed the effective area 
of the dial and the amount of inaccuracy in reading it. The open- 
window dial, with the smallest effective area, produced the least number 
of errors, while the rectilinear dials (horizontal and vertical) with greatest 
effective area resulted in a proportionately larger number of errors. 

Analysis of Errors by Pointer Settings. Figure 9 shows the frequency
of errors according to the dial settings used for each of the five dial types.
From this figure it is possible to visualize the relative number of errors
made at different settings on each dial. It is evident that for nearly all
dials more errors occurred on the mid-division settings than on the
division or whole-number settings. As a partial explanation for the
occurrence of most errors on mid-division settings, one might hypothesize that
when subjects were in doubt about the exact reading, but did have an idea
of the general area of the reading, they would report a whole number
rather than a mid-division reading. On the other hand, this explanation
is illogical in view of the fact that on whole-number settings a portion of
the numeral to be read was obscured by the dial pointer. This perhaps
poses the problem of the eye-movements made by an individual in reading an
instrument dial; that is, does the individual habitually interpolate
between the numbers on either side of the pointer or actually “see” the
number on, or near, which the pointer is situated?











Fig. 9. Total errors for each dial type according to the settings
used in the experiment. (Line chart, one curve per dial type; horizontal
axis: settings, 1 to 9 in steps of 0.5.)


Extent and Direction of Errors Made in Reading the Various Dials. 
Table 5 summarizes most of the data obtained in this experiment and 
emphasizes the extent and direction of errors made in reading the dif- 
ferently shaped dials. An examination of Table 5 shows that no errors 
exceeded plus or minus 2.0 dial units away from the true pointer setting. 
Eighty-four per cent of the erroneously reported readings were within 
plus or minus 1.0 dial units. This suggests that the subjects reading the 
dials could usually discern the general area in which the pointer was 
located, even though they were unable to make precise readings. 

The direction of the errors is indicated in Table 5 by the calculated
constant errors on the dials. The constant error in the case of all dials
was positive. The largest constant error, plus 0.0635, was obtained
with the vertical dial. The next largest constant error was plus 0.0462
with the round dial, showing a tendency to overestimate with these forms.

It will be noticed that in the calculation of the average and constant
errors the number of blank (omitted) readings was subtracted from the
1020 total trials for the dial. The fact that on only 27 of the 5100
trials were subjects unable to report a reading suggests the high visibility
of the pointer and/or the high degree of attention on the part of the
participating subjects.


Table 5

Extent and Direction of Errors Made in Reading the Various Dials

                      Horizontal   Vertical   Round   Open-window   Semi-circular

Correct readings          740         ...       ...       ...           851
Errors, plus              142         ...        78       ...            91
Errors, minus             130         ...        31       ...            74
Total errors              272         ...       109       ...           165
Blank readings              8         ...         2       ...             4
Total incorrect           280         ...       111         5           169

Average error            .2688       .3472     .1071     .0039         .1624
                        N=1012      N=1008    N=1018    N=1019        N=1016

Constant error          +.0119      +.0635    +.0462    +.0020        +.0167
                        N=1012      N=1008    N=1018    N=1019        N=1016

Note: Row labels for the counts are reconstructed from the text. Entries
shown as ... and the breakdown of errors by size (of which only the counts
0, 1; 7, 5; 29, 46 survive) are not recoverable from the source.
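
The average and constant errors printed in Table 5 are reproduced when the error counts are divided by N, that is, by the 1020 trials less the blank readings. That reading of the table is an inference from the figures rather than a formula stated in the text, and the names below are illustrative. A sketch for the three dials whose plus and minus counts are legible:

```python
TRIALS_PER_DIAL = 1020

# (errors in the plus direction, errors in the minus direction, blank readings)
counts = {
    "horizontal": (142, 130, 8),
    "round": (78, 31, 2),
    "semi-circular": (91, 74, 4),
}

def average_and_constant_error(plus: int, minus: int, blanks: int):
    """Errors per reported reading, without and with regard to sign."""
    n = TRIALS_PER_DIAL - blanks           # readings actually reported
    average = (plus + minus) / n           # all errors, sign ignored
    constant = (plus - minus) / n          # excess of plus over minus errors
    return n, average, constant

for dial, (plus, minus, blanks) in counts.items():
    n, average, constant = average_and_constant_error(plus, minus, blanks)
    print(f"{dial}: N={n}, average={average:.4f}, constant={constant:+.4f}")
```

For the horizontal dial, for example, this gives N = 1012, an average error of .2688, and a constant error of +.0119, in agreement with the table.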


Implications of the Findings of this Experiment 


Essentially the purpose which instruments serve is to give indications 
of existing conditions. 

There is a modern trend to make more meaningful the information 
received from control mechanisms and instrumentation. It is obvious 
that the more realistic and the more meaningful instrument indications 
can be made the greater will be the saving in time, elimination of errors, 
and resultant over-all efficiency. From the standpoint of efficiency, ex- 
cessive time spent looking at, or “reading” an instrument dial in order to 
gain information from it, is time wasted. 

The findings of this experiment show that with the use of certain dial 
types high accuracy of reading can be achieved even though the time 
during which the individual views the dial is very brief. 

The application of the findings of this experiment, especially that of
the outstanding legibility of the open-window type dial, must be modified
by recognition of the purpose for which a dial is to be used. When it is
desired that the information from instruments be of a numerical nature,
the results of this study indicate that an instrument with an open-window
dial is most desirable. (Counter-type dials were not studied in this
experiment, but it is probable that they would show results similar to those
obtained for the open-window dial.) In certain situations, however,
where the instrument is designed to give a representation of two- or
three-dimensional space, a dial giving numerical information seems to be less
appropriate than one which provides a replica of the two- or
three-dimensional plot (as in flight and navigation instruments of many types).

In connection with the uses of certain dials studied in the experiment, 
legibility might be a secondary consideration. For instance, dials of the 
round and semi-circular type offer decided engineering convenience and 
their use on this basis alone might be justified. Where direction, right 
or left, and up or down, for example, is of value for increasing the meaning- 
fulness of the information presented by an instrument dial, the use of 
either the vertical or horizontal dials would have advantages. It should
also be noted that the findings of this study refer to the single instrument.
Further research is needed on the optimal design of instruments to be
used in groups or banks.

To summarize: Many factors must be considered in the choice of a 
dial face design. When legibility, or accuracy of reading, is of prime 
importance, the open-window dial, with restricted area, seems to be pref- 
erable to the circular dial, the semi-circular dial, the horizontal, or the 
vertical dial. 


Summary and Conclusions 


1. Five instrument dial types—round, vertical, horizontal, semi- 
circular, and open-window—were compared for legibility. Legibility 
was measured in terms of accuracy of readings made by sixty male sub- 
jects when viewing the dials for a brief period. 

2. The five dial types were equated for size and style of numerals, 
marks and pointers, for contrast (black numerals, marks and pointers 
on white backgrounds), for size and brightness of backgrounds, and for 
positioning of pointer with respect to numerals and marks. Because of 
the variation in dial shape, the effective areas of the several dials varied 
considerably. 

3. Significant differences in accuracy of reading were found for the 
several dials. When each dial was compared with each other, differences 
were found which were significant in every case at the 1% level of confi- 
dence. 

4. In order of accuracy of reading the dials ranked as follows: (1) 
open-window; (2) round; (3) semi-circular; (4) horizontal; and (5) vertical. 
Accuracy was extremely high (one-half of one per cent of readings in
error) on the open-window dial.

5. Errors on all dial types were more frequent on mid-division than 
on whole-number settings, in spite of the fact that on whole-number 
settings much of the number was obscured by the dial pointer. 
Cumulative Effect of a Series of Campaign Leaflets


R. W. Dietsch 
Cleveland, Ohio 


and 


Herbert Gurnee 
Arizona State College 


The effect of printed propaganda on the development of social attitudes
has been investigated in several experimental studies.¹ The purpose
of those studies was usually to compare printed with oral material,
or to compare different kinds of printed propaganda, for example,
emotional versus rational appeals; the material used consisted of a single
leaflet or folder.

There seems to have been no experimentally controlled attempt to 
measure the cumulative effect of a series of propaganda leaflets. Reports 
from the field of commercial advertisement indicate a law of diminishing 
returns in the repetition of some advertising copy, and presumably a 
similar result occurs in other kinds of publicity. It is well known that 
the public soon becomes satiated with repeated political propaganda; 
long before a campaign is over many complaints are heard about the in- 
terminable “‘hot air” of the political office seekers. 

The present study is concerned with the effect of a series of leaflets on
the student-body opinion of a men’s college. Since our interest was in
changing rather than in developing attitudes, we purposely sought an 
issue in which interest was high and about which diverse opinions had 
been generated. Such an issue was available in the question, widely 
discussed on many college campuses, whether the athletic program should 
be subsidized in order to provide a football team capable of competing 
with the best in the country, or whether athletics should be kept within 
purely amateur and recreational limits. 


¹ F. H. Knower. Experimental studies in changes in attitudes. J. Abn. & Soc.
Psychol., 1936, 30, 522-532. G. W. Hartmann. A field experiment on the comparative
effectiveness of “emotional” and “rational” political leaflets in determining election
results. J. Abn. & Soc. Psychol., 1936, 31, 99-114.
Procedure


During the third week of the football season, when the team seemed 
headed for a mediocre season, the following ballot was placed in the mail- 
box of every undergraduate in the college, with the request that it be 
checked and dropped into the box of the editor of the college paper. 


Do you think the college should subsidize athletics 
to such an extent that its team can compete success- 
fully against the top ranking schools in the country? 
Absolutely Yes. Yes. ? No. Absolutely No. 


The ballots were ostensibly secret; actually each ballot was numbered 
on the back and the returns were recorded according to the name of the 
mailbox assignee; 427 students returned the ballot, approximately 60% 
of those receiving it. Their votes were distributed as follows: 


Absolutely Yes     Yes       ?       No      Absolutely No
    49.0%         25.1%     2.3%    10.3%        13.6%


Some students amplified their responses in the form of essays attached 
to the ballot. There were 18 football players and, to a man, they voted 
“Absolutely Yes.” 

On the basis of the returns, the 427 students were divided into four 
groups so arranged that the percentage of votes falling in each of the five 
categories of response was the same in each group. That is to say, each 
of the four groups was made to comprise 49% who voted “Absolutely 
Yes,” 25% who voted “Yes,” and so on as above. One of the groups 
constituted a control and received no leaflets; a second group received one 
leaflet, a third received three leaflets, and a fourth five leaflets. The 
leaflets were distributed a week apart on a Tuesday afternoon, a time 
approximately mid-way between football games. The content in all leaf- 
lets was strongly against subsidization. The first of the series read as 
follows: 


“Every football season someone always shouts ‘R... should go in
for athletics in a big way. R... needs more football players. Subsidize,
Subsidize.’ No statements could be more irrational or even silly. If the
college had any extra money it could use that cash for more beneficial purposes
than subsidizing its football and basketball teams. What about the student
building? What about the men’s dormitory? What about the soda grill in
the basement of the main building? Stack these up against subsidization and
what do you have? A union would be good for the University. What would
be better than the soda grill for an afternoon date? And the dormitory would
be perfect for an out-of-town man or even one from the city. Subsidize
athletics? Not until we have at least a union, or a soda grill, or even a men’s
dorm.”
The second, third and fourth pamphlets were similar in form and
content to the first. The fifth was somewhat different. It is given
below:


“An excellent answer to those men who would have athletics subsidized
can be found in the following letter written by A. J. F. of R.... He writes:
‘Subsidized athletics are on the way out all over the United States: witness
Chicago, M. I. T., Johns Hopkins, and many others. A school that needs a
powerful athletic machine to advertise itself is not worthy of being called a
school; it’s just a playground. At any rate, R... does not need any greater
lure for students than its academic departments. Being from the South, I
know what a great reputation it has. Why not more subsidies to the various
academic departments? Intra-mural and local athletics such as R... goes in
for seem to provide enough school spirit and athletic activity for all normal
purposes. To subsidize and thus enlarge the athletic department seems to me
to be a needless and undesirable move. R... is, has always been, a SCHOOL
first of all. Enough “glory to old R...” is given by its great teachers.
Education is, after all, the primary purpose of any true college.’ ”


The leaflets stimulated much discussion among the students. The 
football squad and even the coaches talked vigorously about them. Sev- 
eral students wrote to the college newspaper and demanded that they be 
stopped and that the identity of the distributor be revealed. The fact 
that the experiment was being carried on during the football season un- 
doubtedly added to the heat of the discussions. 

One week after the final leaflet was distributed, ballots identical with
the first were again placed in the student boxes. Not all of these were
returned, and some that were returned had to be discarded to equalize
the groups on the basis of the initial ballots; the groups obviously had to
be equalized in the five categories of response before sound comparisons
could be made. This left 350 subjects from whom usable returns were
tabulated.
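
The text does not say how students were allotted to the four groups within each category of response. One simple way to obtain groups with the same distribution of initial votes, offered here only as a sketch with hypothetical names and made-up ballots, is to deal the students of each category out to the groups in rotation:

```python
import random
from collections import Counter, defaultdict

CATEGORIES = ["Absolutely Yes", "Yes", "?", "No", "Absolutely No"]

def stratified_groups(votes, n_groups=4, seed=0):
    """Split students into n_groups with matching vote distributions.

    votes maps a student identifier to that student's initial vote.
    """
    rng = random.Random(seed)
    groups = [[] for _ in range(n_groups)]
    by_category = defaultdict(list)
    for student, vote in votes.items():
        by_category[vote].append(student)
    for category in CATEGORIES:
        students = by_category[category]
        rng.shuffle(students)              # random order within the category
        for i, student in enumerate(students):
            groups[i % n_groups].append(student)
    return groups

# Made-up ballots for illustration; not the study's data.
votes = {i: CATEGORIES[i % 5] for i in range(200)}
for number, group in enumerate(stratified_groups(votes), start=1):
    print(number, Counter(votes[s] for s in group))
```

When a category does not divide evenly among the groups, the surplus ballots would have to be set aside, much as the text describes for the final ballot.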


Results 


Our data give a measure of the effects of one, three, and five leaflets.
We had hoped to measure the effects of the second and fourth leaflets
also, but there were not enough subjects to justify a division into two
additional experimental groups. We could have obtained such a measure
by interjecting extra ballots at these points, but this would have
introduced a variable which we thought it advisable to avoid.

The first leaflet produced a significant decrease in favorable opinion
toward subsidization. The per cent who were “Absolutely” in favor
dropped from 49 to 16.1. The measure was obtained, of course, five
weeks after this first leaflet was distributed, since the final ballot was
given to all groups at the same time, namely, one week after the fifth
leaflet was distributed. During this same period the control group
dropped from 49 per cent to 42.5 per cent in the “Absolutely Yes”
category, a loss of only 6.5 percentage points. Assuming the groups to be
comparable, and they were comparable in distribution of original opinions
at least, it seems evident that the single leaflet produced a definite
positive effect as far as expression of opinion is concerned.

The various changes are presented in Table 1; the figures represent loss
or gain in votes in relation to position on the original ballot. The
greatest changes were in the “Absolutely Yes” category; thus the one-leaflet
group dropped 32.9 per cent (from 49.0% to 16.1%), the three-leaflet group
dropped 37.5 per cent, and the five-leaflet group 31.8 per cent (49% to
17.2%). These changes from the “Absolutely Yes” position caused a
necessary increase in certain of the lower categories, with the neutral
position apparently accumulating most of the shifts.


Table 1

Loss or Gain in Per Cent of Votes in the Five Categories of Response

Note: The loss or gain (− or +) is with reference to the total vote on the original
ballot.

                  One Leaflet   Three Leaflets   Five Leaflets   Control

Absolutely Yes       −32.9          −37.5            −31.8         −6.5
Yes                  +12.6          +10.2            +11.4         +4.5
Undecided            +17.2          +23.9            +17.0         −1.1
No                   + 1.2            0              + 1.1         +3.5
Absolutely No        + 2.3          + 3.4            + 2.3           0


Three leaflets obviously produced a slightly greater effect than one;
but the difference is strikingly small and seems hardly enough to justify
the additional expense and effort involved. Thus, with respect to the
results of the three-leaflet series, 88 per cent of the work appears to have
been done by the first leaflet. What may have been the situation after
the second leaflet we cannot say, but it seems extremely doubtful in view
of the trend that the second pamphlet accomplished anywhere near the
effects of the first.
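
The 88 per cent figure follows from the “Absolutely Yes” row of Table 1. A small sketch of the arithmetic is given below; the variable names are illustrative, and the comparison with the control group in the last lines is an added check, not a computation reported by the authors.

```python
# Drop in "Absolutely Yes" votes, in per cent of the original vote (Table 1).
drop = {"one leaflet": 32.9, "three leaflets": 37.5,
        "five leaflets": 31.8, "control": 6.5}

# Share of the three-leaflet change already present in the one-leaflet group.
share_first = drop["one leaflet"] / drop["three leaflets"]
print(f"share attributable to the first leaflet: {share_first:.0%}")

# Change in each group over and above the drift seen in the control group.
for group in ("one leaflet", "three leaflets", "five leaflets"):
    print(f"{group}: {drop[group] - drop['control']:.1f} points beyond control")
```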

More surprising is the effect of the five-pamphlet series. The end 
results are almost exactly the same as for the first leaflet. The law of 
diminishing returns seems to be working here with a vengeance! The 
slight gain on the third pamphlet is wiped out. Five pamphlets are thus 
no better than three. These results are the more striking when we re- 
member that the time interval between the leaflets and the final ballot 
was least advantageous to the first and most advantageous to the fifth 
leaflet; thus the superiority of the first leaflet must have been even greater 
than the data show. 

Of course the content of the last pamphlets may have been responsible
for this decline, although this is doubtful; it is more likely a normal
refractoriness to what was taken to be an overdose of propaganda.
Another possibility is the operation of some unknown extraneous influences,
perhaps newspaper discussions during the intervals; but if such influences
were present, they do not appear in the votes of the control group.

Then again, the factor of timing may have made a difference. One 
week apart may have been too short an interval; it is difficult to believe 
it may have been too long, for refractoriness is known to increase with a 
shortening of the time interval. With intervals of a month possibly the 
effects would have been more cumulative; although with longer intervals 
the element of forgetting naturally becomes greater. 

Another element in timing is to hit an interest when it is most active. 
Our final leaflet and the final ballots were near the close of the football 
season, when student interest in football issues was possibly on the de- 
cline. In this particular college the most crucial game is traditionally 
the last, and the spirit then is at the peak; but there still may have been 
a drop of interest so far as issues like subsidization were concerned. 

The data show another interesting fact, one that has been observed in
studies of debate audiences, namely, that very few subjects change from
one side of the neutral point to the other. Thus there were almost no
additions to the “No” and the “Absolutely No” categories of response.
Twenty-four per cent of the students were opposed to subsidization at
the beginning, and their number was increased by only 3 or 4 per cent at
the end; this is practically no change since the control group shows almost
the same results. Most of the changes were toward or into the neutral
position from the “Yes” positions; there, a psychological barrier seemed
to prevent all but a very few from crossing over. This “barrier” is an
interesting problem for further research; what conditions affect its
permeability, and how?

Table 2

Degree of Change Resulting from the Leaflet Series

Note: Figures are the per cents of subjects changed towards the “Absolutely No”
end of the scale.

                One Leaflet   Three Leaflets   Five Leaflets

One Step             35             42              36
Two Steps            10             12              10
Three Steps           1              2               0
Four Steps            0              0               0








Another indication of resistance is the small number of subjects who
changed more than one step in the scale. The figures are given in Table
2. Figures are for changes in the intended direction. Changes in the
other direction were extremely small, totalling not more than five per
cent. Note that only three per cent of the subjects changed more than
two steps, and this could be expected from chance alone. Approximately
ten per cent changed two steps, in most cases from “Absolutely
Yes” to “Undecided.”

Here again, it can be seen that three leaflets produced only a slightly 
greater effect than one, and the effect of five leaflets is almost exactly 
that of the first in the series. 
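
Summing the columns of Table 2 gives the total per cent of each group that moved toward the “Absolutely No” end of the scale; for the one-leaflet group this is the forty-six per cent quoted in the Summary. An illustrative sketch follows. The mean number of steps is an added descriptive figure, not one reported by the authors, and the names are hypothetical.

```python
# Per cent of subjects moving 1, 2, 3, or 4 steps toward "Absolutely No" (Table 2).
steps_moved = {
    "one leaflet": [35, 10, 1, 0],
    "three leaflets": [42, 12, 2, 0],
    "five leaflets": [36, 10, 0, 0],
}

for group, per_cents in steps_moved.items():
    total = sum(per_cents)
    # Average number of steps moved, counted over all subjects in the group.
    mean_steps = sum(step * pc for step, pc in enumerate(per_cents, start=1)) / 100
    print(f"{group}: {total} per cent shifted, {mean_steps:.2f} steps on average")
```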

It should be pointed out that the above effects might have been different
a month or six months after the series ended. Although the amount
of change produced by five leaflets was no greater than that produced by
one leaflet, it is quite possible that this change may have been more
enduring.


Summary 


1. Several hundred college men were presented a series of leaflets 
against subsidization of college athletics. The leaflets were distributed 
one week apart, after an initial ballot of student opinion had been taken. 
A control group and three experimental groups were set up on the basis of 
the initial ballot. A final ballot was taken upon completion of the leaflet 
series. 

2. The group receiving one leaflet showed a significant change in the 
intended direction, forty-six per cent of the subjects shifting towards the 
“No” end of the scale. 

3. The group receiving three leaflets indicated a slightly greater 
change, but hardly enough to justify the additional effort and expense. 

4. The group receiving five leaflets manifested almost exactly the 
same amount of change as the group receiving but one leaflet. 


Received October 14, 1947. 
A Validating Study of the Work Preference Inventory


George D. Lovell, Hartwell Davis, and Alfred Meacham 
Grinnell College 


Robert W. Henderson of Massillon, Ohio, has published an instrument
called the Work Preference Inventory (4) which attracted the
attention of the authors because it proposes to secure an indication of both
interest and personality traits through the administration of one set of 
test items. Such an instrument, properly validated, would be of great 
use to college counselors in suggesting vocational fitness for occupations 
with known requirements and for indicating areas needing special counsel. 

The 1946 manual for the inventory gives measures of validity ranging 
from a bi-serial correlation of .71 to one of .98. The high and low ratings 
for the bi-serial correlation were determined by contrasting the answers 
given by well adjusted soldiers who had never been referred to a neuro- 
psychiatric clinic with answers given by soldiers who were to be dis- 
charged for neuropsychiatric reasons. In a personal communication to 
one of the authors Mr. Henderson indicated that the high validity coeffi- 
cients were due to having a very neurotic group to compare with the 
normal. He also suggested that further study of the inventory was 
needed with other groups and was being planned by several organizations. 

This report covers the comparison of the personality scores made by 
college students with ratings of these students by close acquaintances. 
Thus it introduces a different kind of validation from that used by the 
author of the inventory and checks the validity of the existing scale for 
college students instead of neurotic soldiers. An indication of the in- 
ventory’s usefulness for college advisement should result from such a 
study. 

The Work Preference Inventory gives ten personality scores and seven 
interest scores, although, according to the author, the testees will not 
realize it is anything other than an interest test. The personality traits 
measured are reliability, perseverance, emotional stability, creativeness, 
conservatism, ambition, masculinity, introversion, anxiety-depression, 
and neurotic index. Interest areas listed are persuasive, social service, 
theoretical, artistic, mechanical, economic, and scientific. 

The test is composed of pairs of job descriptions such as, “Do or would
you prefer work that is


1. INSIDE OUTSIDE
2. EXCITING ROUTINE.”

The testee is asked to show his preference for one job or the other.
Thus in the sample above he may indicate his degree of preference as
follows: Prefers inside work strongly; Prefers inside work; Likes both;
Prefers outside work; Prefers outside work strongly.

The test was constructed to be used as an employee selection tool, or 
as a guidance and clinical instrument. By disguising the personality 
component of the test, it was felt that a more honest evaluation of per- 
sonality could be gained. 


Procedure 


The inventory was administered to a class of eighty-two college stu- 
dents attending Grinnell College. The class was composed of twenty- 
eight male and fifty-four female students, of sophomore and junior class 
standing. In administration, precautions were taken to insure that the 
testees understood the test directions properly and that they were un- 
crowded and unhurried. The test was administered in a serious manner 
and was accepted in a like manner by the students. 

The criterion of validity was obtained by the use of a graphic rating 
scale constructed for the purpose. Its construction was as follows: The 
test items that influenced scoring on each trait were listed along with the 
author’s definition of each trait in an attempt to make the rating scales 
conform as nearly as possible to the inventory. Since the test items were 
not in objective terms of any specific type of behavior, it appeared im- 
practical to relate the terms of the rating scale to each specific item of the 
test. The rating scales were therefore developed on an a priori basis from 
the author’s definition of each trait. In most instances one scale was 
developed for each trait; however, it was necessary to develop more than 
one scale for some of the traits, in order to measure all the components of 
the trait as described by the author of the Work Preference Inventory. 

Nine of the ten personality traits listed by Henderson were chosen for 
study. The tenth, neurotic index, was derived from a particular weight- 
ing of heterogeneous items and was thought to be too complex for rating. 

The first draft of the graphic rating scale was presented to a class of
fifteen junior and senior college students who had been studying rating
scale construction. This group made many suggestions concerning
conformity of the proposed scales to the trait definitions and concerning the
wording of the scales so as to achieve apparently equal psychological
spacing of the descriptive guides placed under each rating line. Every
effort was made to make certain that the group fully understood the scope
and purpose of the study, and their suggestions were in essential
agreement. The attempt was made to combine the best features of a number
of sample rating forms so that the final scales would conform to proven
principles of rating scale construction (3, 7, 9, 10, 12).
A sample of the actual rating form, shown in Figure 1, will demonstrate
the general nature of the scale.


1. How does he get along with strangers?

   |        |        |        |        |        |        |
   Is completely       Is nearly always     Is reasonably sure
   poised and          sure of himself      of himself and can
   converses quite     and ...              carry on a fair
   freely                                   conversation


2. How does he react to social gatherings?

   |     |     |     |     |     |     |     |     |     |     |
   Retiring; very     Limits          Mixes as well    Is one of the    Always self-
   self-conscious     contacts to     as the           livelier         assured; the
   and ill at ease    one or two      average          members of       “live wire”
                      persons                          the group        of any group


Fig. 1. Sample of rating form used in obtaining the criterion.





Each scale, as can be seen, was written in terms of a particular type of 
objective behavior. The nature of the behavior under consideration in 
each case was keynoted by an initial question and then descriptive guides 
were put in terms of the same type of behavior. The descriptive guides 
were subordinated to the keynoting question by printing them in smaller 
type. The continuity of the rating line was emphasized by making the 
horizontal line considerably heavier than the vertical division marks. 


Thorough reading of the descriptive guides was encouraged by alternating 
the high-low direction of the rating line on successive items. Ample 
space was allowed between the items in order to insure positive separation 
of ideas, and a space for comments was provided at the end of the report. 

The rating report was considered to be a valid rating of the traits 
under consideration for the following reasons: 


1. The best available techniques of graphic rating scale construction 
were employed. 

2. In the majority of cases there was agreement among the four raters 
of each individual. In less than 20% of the cases were there discrep- 
ancies of more than four scale points (out of ten) among the raters for 
each subject. 

3. The individual scales of the rating report were checked by visual 
inspection of the frequency curves. In some instances, the curves were 
slightly skewed but in each case the skew was in the direction to be ex- 
pected from the group used as subjects. For example, the resulting 
Emotional Stability curve indicated that the group tended to be more 
emotionally stable than the typical population. 

Two weeks after the inventory had been administered, the 82 subjects
were told the nature of the study being made and each was asked to list
five of his closest campus friends. The names were listed on a
mimeographed form in order of intimacy, together with information concerning
length and nature of association. Four of the five friends listed by each
subject were asked to rate him. Assistants were present during the 
scheduled rating hours to answer questions and check over the instruc- 
tions with each rater. In all contacts with the raters, emphasis was 
placed on the fact that the rating data would be treated in strict confi- 
dence. Excellent cooperation was demonstrated by all students involved. 
The serious attitude of the raters was demonstrated by the fact that a 
high percentage of the students supplemented the rating report by adding 
comments in the space provided. 

The general method of scoring and averaging the rating scales and 
the method of correlation used are as follows: 

In most cases one scale was developed to measure each trait; however, 
in some instances a single scale was felt to be inadequate and two or more 
scales were used to measure the component parts of those traits. When 
more than one scale was used, the scales were weighted in such a way as 
to place each trait score on an equal basis. 

The four rating scales for each subject were then averaged by traits 
and recorded along with the test scores for the corresponding traits. A 
frequency distribution for each trait was compiled from the averaged 
rating scale scores and a mean and median were determined for each 
distribution. The rating scale scores were then labeled with a plus or 
minus depending upon comparison with the central tendency. A bi-serial 
correlation was then used to compare the test scores with the plus and 
minus rating scale scores. 
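
The scoring procedure described above can be illustrated with a minimal sketch in Python. It assumes the conventional bi-serial formula, r = ((M_plus - M_minus) / SD) * (pq / y), where y is the ordinate of the normal curve at the point of division, and it assumes that the split was made at the median of the averaged ratings. The report does not state whether the mean or the median was used, and the function names are illustrative only.

```python
from statistics import NormalDist, fmean, median, pstdev


def dichotomize(avg_ratings):
    """Label each averaged rating plus (True) or minus (False).

    Assumption: the cut is made at the median; ties fall in the minus group.
    """
    cut = median(avg_ratings)
    return [rating > cut for rating in avg_ratings]


def biserial_r(test_scores, plus_flags):
    """Bi-serial r between continuous test scores and a plus/minus split."""
    plus = [s for s, flag in zip(test_scores, plus_flags) if flag]
    minus = [s for s, flag in zip(test_scores, plus_flags) if not flag]
    if not plus or not minus:
        raise ValueError("both the plus and the minus group must be non-empty")
    p = len(plus) / len(test_scores)   # proportion rated plus
    q = 1.0 - p                        # proportion rated minus
    std_normal = NormalDist()
    y = std_normal.pdf(std_normal.inv_cdf(p))  # ordinate at the cut point
    sd_total = pstdev(test_scores)     # SD of all test scores
    return (fmean(plus) - fmean(minus)) / sd_total * (p * q / y)


# Hypothetical data for six subjects (not taken from the study).
ratings = [1, 2, 3, 4, 5, 6]
scores = [10, 12, 14, 16, 18, 20]
r_bis = biserial_r(scores, dichotomize(ratings))
```

Because the bi-serial coefficient assumes a normally distributed trait underlying the plus and minus labels, it can exceed 1.00 when that assumption is badly violated, as it is in the small artificial sample above.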


Results 


The results shown as bi-serial r’s between the test and averaged ratings 
for each trait are presented in Table 1. 


Table 1 


Bi-serial correlations between ratings and test scores for 9 traits. 
N = 82 college students 


Trait                    Bi-Serial r 
Reliability               .08 ± .09 
Perseverance              .26 ± .09 
Emotional Stability      -.14 ± .09 
Creativeness              .14 ± .08 
Conservatism             -.07 ± .08 
Ambition                  .19 ± .09 
Masculinity               .39 ± .08 
Introversion              .31 ± .09 
Anxiety-Depression        .25 ± .09 


Conclusions 


1. On the basis of the a priori rating report devised for this study, it 
would appear that none of the nine measures of the Work Preference 
Inventory, as listed in Table 1, is valid as a measure of personality traits 
of a normal college population. 

2. Until further work is done to improve the test, its value as a 
clinical tool in college counseling remains doubtful. 

3. Since the subjects used in this study were presumably representa- 
tive of a normal college population, these results would not discount the 
value of the test as a clinical aid with a more deviant population. 


Received September 23, 1947. 


Influence of College Science Courses on the Development of 
Attitude Toward Evolution 


Key L. Barkley 
Woman’s College of University of North Carolina 


Some doubt has been raised as to whether the curriculum studied in 
college makes much difference with respect to changes in students’ 
attitudes toward such things as law, the church, the constitution, war, 
and God. These attitudes are more or less general in nature and perhaps 
not very specifically related to any introductory college course or curricu- 
lum. It would appear, however, that attitude toward evolution would 
be more specifically related to studies of science, and subject to change 
by reason of advance in such courses. 


Plan of the Experiment 


Purpose. The general purpose of the present investigation was to 
bring out any discoverable curriculum influences on development of stu- 
dents’ attitude toward evolution. The specific purposes were: (1) to 
find out whether study of science and mathematics in high school had any 
relation to students’ attitude toward evolution at the time they entered 
college; (2) to discover the changes in attitude toward evolution made by 
college freshmen in a regular college course which included two semesters 
of biology, mathematics, or chemistry, or a combination of two semesters 
each of biology and chemistry; (3) to compare the changes in freshmen’s 
attitude toward evolution with those made by students in a one year com- 
mercial course which had no science studies in it; (4) to discover the 
changes in attitude toward evolution made by upper class students who 
took a freshman science course, or an advanced course in anatomy. 

Subjects. The Freshman science groups were composed of those who 
were studying a course in introductory biology, chemistry, or mathe- 
matics, or courses in both biology and chemistry at the same time, but 
who otherwise took the same general courses. These were all separate 
and distinct groups with each student appearing in only one group. The 
upperclassmen groups were composed of the students above the freshman 
level who were in the freshman science courses, or who were taking an 
advanced course in anatomy. (The anatomy course lasted just one 
semester.) The commercial students were admitted to the college on 
the basis of the same general high school credits as required of the fresh- 
men, but the commercial students took a course which was strictly a 
business college curriculum not including any science or mathematics. 
A special group of freshmen who took introductory chemistry both semes- 
ters was also tested. 

Materials. The measuring instrument used was the Attitude Toward 
Evolution Scale made by T. G. Thurstone under the editorship of L. L. 
Thurstone, Scale 30, Forms A and B. 

Procedure. The method of test-retest was used. Form A of the scale 
was given to the commercial students and to all students in the biology 
and anatomy groups in the fall. All the other groups were given forms 
A and B in approximately equal numbers in the fall. Each student was 
then retested at the end of the course in the spring with the form which 
he had not marked previously. Some of the biology students were re- 
tested at the end of the first semester with Form B. These students had 
to be given Form A at the end of the year. 


Results 


The distribution of time given to mathematics and science in high 
school by the various groups of subjects is shown in Table 1. The cases 
of reliable differences in per cents of the groups which took the different 


Table 1 


Showing the Per Cent of Each Group of Subjects who had Mathematics and the 
Various Sciences in High School 


                          College Students Grouped According to Subjects Taken 

                                                                  Biology and 
H.S. Course               Biology    Chemistry    Mathematics    Chemistry      Commercial 

General Science, 1 yr.     42.3        45.9          44.6           60.7           71.8 
Biology, 1 yr.             82.2        80.7          78.2           83.3           81.7 
         2 yrs.             2.7                      10.9            3.6 
Chemistry, 1 yr.           34.2        52.3          34.6           59.5           14.1 
Physics, 1 yr.             15.1        13.8           4.0           19.0           22.5 
Mathematics, 1 yr.                      1.8           1.0                           1.4 
             2 yrs.         4.1        11.9          10.9            9.5           12.7 
             3 yrs.        79.5        59.6          33.7           56.0           70.4 
             4 yrs.        16.4        26.6          54.5           34.5           15.5 





science courses in high school are presented in Table 2. The absence 
of certain comparisons from Table 2 indicates that there were no signifi- 
cant differences between the freshman groups who elected college biology, 
chemistry, or mathematics in the per cents who had studied any 
of the sciences in high school. Moreover, there were no significant dif- 
ferences between any of the groups in the per cents who had studied 
biology in high school. (Note absence of comparisons between the groups 
with respect to study of biology in high school.) It was found, however, 
that a reliably greater per cent of the freshman group which took both 
biology and chemistry in college than of any other group, except the one 
which elected chemistry in college, had studied chemistry in high school, 


Table 2 


Showing the Cases of Reliable Differences between Groups in Per Cents which Took 
the Various Sciences in High School, and the Reliability Indices 
of those Differences * (Taken from Table 1) 


                     College Students Grouped According to Subjects Taken 

                                                                        Biology and 
                        Biology        Chemistry      Mathematics      Chemistry 
                        D     D/σD     D     D/σD     D     D/σD       D     D/σD 

Biology and Chem. 
Group 
  H.S. Gen. Sci.       .184   2.33    .148   2.10    .161   2.21 
  H.S. Chemistry       .253   3.61                   .249   3.51 
  H.S. Physics                                       .150   3.20 
Commercial Group 
  H.S. Gen. Sci.       .295   3.73    .259   3.60    .272   3.73 
  H.S. Chemistry      -.201   3.30   -.382   6.06   -.205   3.25     -.454   6.68 
  H.S. Physics                                       .185   3.49 


*The groups at the left had the larger per cents taking the high school courses, 
except where the minus signs show they had the smaller per cents. 


and that a reliably smaller per cent of the commercial group studied chem- 
istry in high school than of any of the freshman groups. It is shown also 
that a reliably larger per cent of the commercial group studied general 
science in high school than of any other group, except the one which took 
both biology and chemistry in college. 
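
The reliability indices of this kind can be approximated with a short sketch. The article does not state which formula was used for the standard error of a difference between per cents; the sketch below assumes the usual formula for two independent proportions, and it assumes that the group sizes are the N's given later in Table 3 (71 commercial students, 73 biology students). The function name is illustrative only.

```python
from math import sqrt


def critical_ratio_percent(p1, n1, p2, n2):
    """Return (D, D / sigma_D) for two independent proportions.

    Assumption: sigma_D = sqrt(p1*q1/n1 + p2*q2/n2), the unpooled
    standard error of a difference between two proportions.
    """
    d = p1 - p2
    sigma_d = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return d, d / sigma_d


# General science in high school: commercial group (71.8 per cent)
# against the biology group (42.3 per cent), values from Table 1.
d, cr = critical_ratio_percent(0.718, 71, 0.423, 73)
```

Under these assumptions the difference is .295 and the ratio falls close to the 3.73 reported in Table 2; the small discrepancy may reflect rounding or a slightly different formula in the original computation.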

Since, as it will be shown later, all but one of the freshman groups had 
mean scores showing a more favorable attitude toward evolution than 
that held by the commercial group, it would appear that election of gen- 
eral science in high school to the neglect of the more specific science 
courses is associated with a less favorable attitude toward evolution when 
the students get to college. This finding is not unequivocal, however, 
since one freshman group was not reliably more favorable in attitude to- 
ward evolution than the commercial group. 

The fact that the commercial students showed a less favorable atti- 
tude toward evolution than the freshman groups should not be interpreted 
as being due to what the students learned in general science. Forty to 
sixty per cent of all the freshman groups studied general science in high 
school. Moreover, the freshman group which elected both biology and 
chemistry in college gave so much time to general science in high school 
that it was not reliably different from the commercial group on this score. 
It also will be noted in Table 2 that the biology and chemistry freshman 
group emphasized the study of general science in high school more than 
any other freshman group; the critical ratios of the differences were all 
2 plus. Even with this extra emphasis on general science, the biology and 
chemistry freshman group had the most favorable attitude toward evolu- 
tion shown by any group. It appears, then, that the study of general 
science is not in itself a hindrance to the development of a favorable atti- 
tude toward evolution, but that a more favorable attitude tends to be 
developed if the study of general science is supplemented by adequate 
emphasis on the study of more specific science courses. 

There is some indication that attitude toward evolution is associated 
with the amount of time spent studying mathematics and science in high 
school. At the first testing, the commercial group had the least favorable 
attitude toward evolution and the group which elected to study both 
biology and chemistry in college had the most favorable attitude. The 
commercial group spent the least time on mathematics and science in 
high school and the freshman group which elected both biology and chem- 
istry in college spent the most time on these subjects. The following 
tabulation showing the average number of years spent on mathematics 
and science in high school by all the groups will make this difference 
plain: Commercial, 4.9; Biology, 4.92; Chemistry, 5.04; Mathematics, 
5.25; Chemistry and Biology, 5.55. 
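
The averages in this tabulation can be reproduced from the per cents in Table 1 by weighting each per cent by the number of years of study it represents. The sketch below is one reading of the table, not the author's stated procedure: it assumes that blank cells are zero, and it assumes that the partly filled rows (a second year of biology, and one year of mathematics) belong to the groups shown in the comments.

```python
# Each entry is (years of study, per cent of the group), read from Table 1.
# Order: general science, biology 1 yr., biology 2 yrs., chemistry, physics,
# then mathematics 1, 2, 3 and 4 yrs. Blank cells are assumed to be zero.
TABLE_1 = {
    "Commercial": [(1, 71.8), (1, 81.7), (2, 0.0), (1, 14.1), (1, 22.5),
                   (1, 1.4), (2, 12.7), (3, 70.4), (4, 15.5)],
    "Biology": [(1, 42.3), (1, 82.2), (2, 2.7), (1, 34.2), (1, 15.1),
                (1, 0.0), (2, 4.1), (3, 79.5), (4, 16.4)],
    "Chemistry": [(1, 45.9), (1, 80.7), (2, 0.0), (1, 52.3), (1, 13.8),
                  (1, 1.8), (2, 11.9), (3, 59.6), (4, 26.6)],
    "Mathematics": [(1, 44.6), (1, 78.2), (2, 10.9), (1, 34.6), (1, 4.0),
                    (1, 1.0), (2, 10.9), (3, 33.7), (4, 54.5)],
    "Chemistry and Biology": [(1, 60.7), (1, 83.3), (2, 3.6), (1, 59.5),
                              (1, 19.0), (1, 0.0), (2, 9.5), (3, 56.0),
                              (4, 34.5)],
}

# Averages as reported in the text.
REPORTED = {
    "Commercial": 4.9,
    "Biology": 4.92,
    "Chemistry": 5.04,
    "Mathematics": 5.25,
    "Chemistry and Biology": 5.55,
}


def mean_years(rows):
    """Average years per student: sum of years * per cent, divided by 100."""
    return sum(years * pct for years, pct in rows) / 100.0


for group, rows in TABLE_1.items():
    print(f"{group}: computed {mean_years(rows):.3f}, reported {REPORTED[group]}")
```

Under these assumptions each computed average agrees with the reported figure to within about .01, which suggests that the averages were obtained in some such way.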

The more favorable attitude toward evolution shown by the freshmen 
as compared with the commercial students probably is associated with a 
greater interest in science on the part of the freshmen. Evidence of this 
probability is found in the greater amount of time given by the freshmen 
to science and mathematics in high school, their election of a liberal arts 
course instead of a commercial course, and in the highly favorable score of 
the biology and chemistry group which was composed of majors in those 
two fields. 

The mean scores of the several groups of subjects at the time of both 
testings are shown in Table 3. These scores show that in the fall the 
freshman groups who chose biology, or mathematics, or both biology and 
chemistry for their first year science courses were neutral in attitude, 
according to a scale furnished by the makers of the test. Those who 
elected chemistry only were slightly prejudiced against evolution. The 
commercial students also showed mild prejudice against evolution. The 
upperclassmen who took freshman science courses were neutral in atti- 
tude, but those who took the anatomy course were more advanced in 
science and were believers in evolution. In the spring, the group studying 
both biology and chemistry and the upperclassmen in a freshman science 
course had stepped up to a position of belief in evolution. The formerly 
prejudiced chemistry group moved up to a neutral position. The other 
groups remained in the same general positions they had held at the first 
testing, even though some changes in attitude toward evolution had been 
made. 


Table 3 


Showing Mean Fall and Spring Scores of All Groups of Students, Differences between 
the Mean Scores, and the Reliability Indices of the Differences * 


                                  Fall          Spring                   D 
Group                      N      Mean Score    Mean Score    Diff.    σ diff. 

Biology                    73       5.35          5.81         .46      3.01 
Chemistry                 109       4.95          5.46         .51      4.47 
Mathematics               101       5.33          5.63         .30      2.48 
Biology and Chemistry      84       5.62          6.41         .79      5.81 
Commercial                 71       4.65          4.41        -.24      1.70 
Upper class students       33       5.82          6.01         .19      1.32 
Chemistry repeats          26       5.13          5.58         .45      2.00 
Anatomy                    60       6.47          6.42        -.05       .40 


* The formula for correlated measures was used in calculating the critical ratios in 
this table. 


Table 3 also shows the differences between the mean scores of the 
various groups of subjects at the first and second testings.¹ It was found 
that the freshman groups which studied biology, or chemistry, or both 
biology and chemistry at the same time made statistically significant 
changes in attitude during the year, but those who studied mathematics 
achieved a change with a critical ratio of only 2.48. Commercial stu- 
dents, upperclassmen in a freshman science course, chemistry repeaters, 
and upperclassmen in an anatomy class did not make reliable changes 
in mean scores. 
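
The critical ratio for correlated measures can be sketched as follows. The sketch assumes the standard textbook formula, in which the standard error of the difference is reduced by the correlation between the two sets of scores. The article reports the means, the N's, and the fall-spring correlations, but not the standard deviations, so the value of 1.3 used below is purely hypothetical.

```python
from math import sqrt


def critical_ratio(mean_1, mean_2, sd_1, sd_2, n, r=0.0):
    """Return (D, D / sigma_D) for two means obtained from the same n subjects.

    r is the correlation between the two sets of scores. With r = 0 the
    expression reduces to the formula for uncorrelated measures (equal n).
    """
    se_1 = sd_1 / sqrt(n)  # standard error of the first mean
    se_2 = sd_2 / sqrt(n)  # standard error of the second mean
    sigma_d = sqrt(se_1 ** 2 + se_2 ** 2 - 2 * r * se_1 * se_2)
    d = mean_2 - mean_1
    return d, d / sigma_d


# Biology group: N = 73, fall mean 5.35, spring mean 5.81, r = .49.
# The standard deviations (1.3) are hypothetical, not from the article.
d, cr = critical_ratio(5.35, 5.81, 1.3, 1.3, 73, r=0.49)
d_unc, cr_unc = critical_ratio(5.35, 5.81, 1.3, 1.3, 73)  # r = 0
```

With a positive correlation between testings the standard error of the difference shrinks, so the same difference in means yields a larger critical ratio than the uncorrelated formula would give.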

¹ As indicated in Table 3, the correlations between the fall and spring scores of the 
various groups were worked out and used in determining the critical ratios of the changes 
made in attitude. The following correlations between fall and spring scores were found: 
Biology group, .49; Chemistry group, .57; Mathematics group, .42; Biology and Chem- 
istry group, .47; Commercial group, .55; upper class students in a freshman science, .81; 
chemistry repeats, .49; Anatomy group, .60. 

Table 4 shows the reliability of the differences between the degrees 
of change in mean scores made by the different groups. There was no 
statistically significant difference in degree of change made by any of the 
freshman groups as compared with any other. Only the freshmen who 
studied both biology and chemistry made a change which was by the most 
exacting criterion reliably greater in degree than that made by the com- 
mercial students (critical ratio of 3.75). However, the critical ratios of 
the differences in change between the commercial and the other freshman 
groups were in every case above 2.00, and these ratios indicate a high prob- 
ability of significant differences. Moreover, all the freshman groups 
changed toward a more favorable attitude in sufficient degree to produce 
a reliable difference between their mean scores and the mean score of the 
commercial group at the second testing. 


Table 4 


Showing the Reliability Indices of the Differences Between the Degrees of Change 
Noted in the Various Groups of Students Following 
a Year’s Study in College * 
(D = d1 - d2) 


                                                                 Biology and 
                 Biology        Chemistry      Mathematics      Chemistry 
                 D     D/σD     D     D/σD     D     D/σD       D     D/σD 

Biology                         .05    .18                      .33   1.18 
Chemistry                                                       .28   1.10 
Mathematics      .16    .61     .21    .89                      .49   2.00 
Commercial       .70   2.41     .75   2.82     .54   2.10      1.03   3.75 


*The groups named at the top showed the greater change as compared with the 
groups named at the left. 
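
The D values of Table 4 follow directly from the changes reported in Table 3, as the short check below illustrates. The dictionary simply restates the Diff. column of Table 3; the function name is illustrative only.

```python
# Fall-to-spring change in mean score for each group (Diff. column, Table 3).
CHANGE = {
    "Biology": 0.46,
    "Chemistry": 0.51,
    "Mathematics": 0.30,
    "Biology and Chemistry": 0.79,
    "Commercial": -0.24,
}


def change_difference(top_group, left_group):
    """D = d1 - d2: change of the group named at the top of Table 4
    minus the change of the group named at the left."""
    return round(CHANGE[top_group] - CHANGE[left_group], 2)


# Every freshman group compared with the commercial group.
for group in ("Biology", "Chemistry", "Mathematics", "Biology and Chemistry"):
    print(group, change_difference(group, "Commercial"))
```

The critical ratios in Table 4 cannot be checked in the same way, since they also depend on the standard errors of the changes, which the article does not report.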

Table 5 shows the differences between the mean scores of the several 
groups of subjects at the time of both testings. It is noted that the fresh- 
man groups were not reliably different from each other in the fall, except 
that the group which studied both biology and chemistry was reliably 
more favorable in attitude toward evolution than the group which elected 
chemistry alone. (It should be pointed out, however, that both the 
biology and the mathematics groups had higher mean scores than the 
chemistry group, and that the critical ratios of the differences were above 
2.0.) All the freshman groups, except the one which studied chemistry 
only, were reliably more favorable in attitude toward evolution than the 
commercial group. At the spring testing, the group of freshmen who had 
had both biology and chemistry was reliably more favorable in attitude 
toward evolution than any of the other freshman groups. The freshman 
groups who studied biology, chemistry, or mathematics were not signifi- 
cantly different from each other in mean scores. All freshman groups 
were reliably more favorable in attitude toward evolution than the com- 
mercial group. 








Table 5 


Showing the Differences Between Mean Scores of the Various Groups of Students in 
the Fall and in the Spring,* and the Reliability Indices 
of these Differences ** 


                                                                             Biology and 
                          Biology          Chemistry        Mathematics      Chemistry 
                          D     D/σ diff.  D     D/σ diff.  D     D/σ diff.  D     D/σ diff. 

Biology       Fall                                                            .27    1.36 
              Spring                                                          .60    3.03 
Chemistry     Fall        .40    2.12                       .38    2.21       .67    3.58 
              Spring      .35    1.79                       .17    1.06       .95    5.49 
Mathematics   Fall        .02     .11                                         .29    1.59 
              Spring      .18     .96                                         .78    4.78 
Commercial    Fall        .70    3.85      .30    1.78      .68    4.15       .97    5.39 
              Spring     1.40    6.14     1.05    5.10     1.22    6.16      2.00    9.60 


* Groups named at the top made the higher scores as compared with the ones named 
at the left. 


** The formula for uncorrelated measures was used in calculating the critical ratios 
in this table. 


These results appear to indicate that the successful study of any gen- 
eral science or mathematics course by freshmen tends to promote change 
to a more favorable attitude toward evolution. There is a slight sugges- 
tion that the laboratory sciences may be more effective than mathematics, 
possibly because of containing more material directly related to evolution 
and a greater emphasis upon scientific approach in doing the actual lab- 
oratory exercises. 

It should be noted that the group who took both biology and chemistry 
at the same time in college had the highest average score found in any 
group at both testings. This group was composed of students who plan- 
ned to major in either biology or chemistry; they had had more science 
and mathematics in high school than the others; and they made more 
progress in the study of science in general during the first year in college. 
It would appear, therefore, that favorable attitude toward evolution is 
associated with a special interest in science and with progress in the study 
of science. 


Summary and Conclusions 


College freshmen taking a course in introductory biology, chemistry, 
mathematics, or both biology and chemistry were given the Thurstone 
Scale on Attitude Toward Evolution at the beginning and at the end of 
these full year courses. A group of commercial students, who had a year 
of work in the same college without any science courses being included in 
their curriculum, was tested in the same manner. 

Among the freshmen only those who took both biology and chemistry 
were significantly different from any of the others in attitude toward 
evolution when first tested; those taking both science courses were more 
favorable in attitude than those who elected chemistry only. Regular 
college freshmen, however, were significantly more favorable in attitude 
toward evolution than the commercial students, except in the case of 
those who elected chemistry as their freshman science course. 

During the year, all freshman students of science changed to a more 
favorable attitude toward evolution, while the change noted in the com- 
mercial students was insignificant. Those groups taking biology or 
chemistry alone and the one taking both biology and chemistry made 
statistically reliable changes. The difference in degree of change be- 
tween freshman students studying the various sciences was not statis- 
tically reliable. The difference in degree of change between the com- 
mercial group and the students of science had a critical ratio of 2 plus in 
all cases and was definitely reliable between the commercial group and 
the freshman group which took both biology and chemistry (critical ratio 
of 3.75). 

At the second testing, all groups of students who had studied science 
were reliably more favorable in attitude toward evolution than the com- 
mercial students. Moreover, those students who had studied both bi- 
ology and chemistry were reliably more favorable in attitude toward evo- 
lution than were those other freshmen who had studied only biology, or 
chemistry, or mathematics. 

Several conclusions may be drawn from the findings: 


1. The characteristic attitude of the regular college freshman, at the 
college where the study was made, tends to be one of neutrality or doubt 
respecting evolution. Those who elect the one year commercial course 
tend to be prejudiced against evolution. 

2. Study of courses in science and mathematics tends to promote the 
development of a more favorable attitude toward evolution. The study 
of biology or chemistry alone or a combination of two sciences (biology 
and chemistry) appeared to have sufficient influence to facilitate a change 
in attitude which was shown to be significant by the most rigorous statis- 
tical criterion. Moreover, none of the various science courses considered 
in this investigation was significantly more effective than the others in 
promoting changes in attitude toward evolution. 

3. Upperclassmen in the groups studied here tend to be favorable 
in attitude toward evolution, but they showed very little change in atti- 
tude following study in a freshman science or an advanced course in 
anatomy. Their attitude appears to depend upon prior studies and other 
influences. 

4. Study in a curriculum which did not contain any science courses 
tended to leave the student with his attitude toward evolution unchanged. 

5. Special interest in science, as indicated by election of more than 
one course in it and by choice of science as a major field of study, is as- 
sociated with a favorable attitude toward evolution and is also accom- 
panied by a reliably greater development toward a still more favorable 
attitude than is noted in the case of those who are non-majors and take 
only one course in science as freshmen, or who study in a commercial 
curriculum not containing science courses. Likewise, a more favorable 
attitude toward evolution is associated with more study of and progress 
in science as indicated by a devotion of a greater amount of time to mathe- 
matics and science in high school. 

6. Probably these findings cannot be too liberally generalized, but are 
indicative of the conditions and developments in the college where the 
study was made. 


Received June 24, 1947. 


Book Reviews 


Lytle, Charles Walter. Job evaluation methods. New York: The Ronald 
Press Company, 1946. Pp. 329. $6.00. 


Job evaluation methods is essentially a description, with illustrations, 
of the steps involved in the development of job evaluation plans, and of 
the various ways in which these steps may be accomplished. 

The description of methodology is preceded by a review of the factors 
that have focused more serious attention on job evaluation in the past 
several years, a summary of the purposes of job evaluation, a discussion 
of the integration of job evaluation with other management functions, 
especially job control, a reminder of the importance of clearly defined 
personnel policies as they relate to job evaluation, and a review of per- 
tinent organizational and administrative considerations. 

The author makes no pretense of introducing any new basic methods 
or techniques; rather, he has sought to present in a single package various 
fundamental and detailed prevailing practices. The book is a distinctly 
significant contribution toward this objective not only because of the 
comprehensive and careful treatment of methodology as such, but also 
because of the generally adequate analysis of various practices, the in- 
clusion of words of warning on various aspects, and the prevailing em- 
phasis on practical considerations. Illustrative of the superior treatment 
of the subject matter, for example, are the discussions of the influence on 
the character of the trend line of the use of arithmetic versus geometric 
progression in the assignment of degree allotments, and the attention 
devoted to the building of the rate structure. 

It is to be wished, however, that the thoroughness that is generally 
characteristic of the book had been extended to include appropriate treat- 
ment of certain of the more theoretical underpinnings of job evaluation which 
have significant practical implications. In connection with the selection 
of job factors, for example, considerable attention is devoted to a review 
of various plans and the crystallization of a standard pattern of job factors; 
there is, however, no reference to the possible use of statistical techniques 
such as factor analysis for revealing inter-relationships among job factors 
and as an aid in identifying the most distinctly independent factors for 
inclusion in job evaluation plans. Similarly, while there is a presentation 
of both the extreme and the more typical weightings of various factors, 
there is no suggestion of the possible statistical determination of weight- 
ings which most nearly reflect the relative intrinsic economic worth of 
the various factors as determined by the dynamics of economic and social 
forces. In addition, the treatment of the statistical reliability of job 
ratings seems much too superficial in the light of its importance. 

Aside from these blind spots, however, and an occasional more trivial 
transgression, the book can be considered as a distinctly effective venti- 
lation system for the hazy atmosphere that characterizes some current 
thinking on job evaluation, and can be recommended for use as a uniquely 
adequate text or manual on job evaluation. 


C. H. Lawshe, Jr. 
Division of Applied Psychology, 
Purdue University 


The Society for the Advancement of Management, New York Chapter, 
1945 Conference Proceedings: Selection of sales personnel and aptitude 
testing. New York: Sutton-Malkames Co., Inc., 1946. Pp. xiii + 
137. $4.00. 


This is a report of a conference on sales personnel and aptitude held 
by the New York Chapter of the Society for the Advancement of Manage- 
ment. These proceedings comprise speeches concerning (1) the scientific 
basis of psychological tests, (2) aptitude testing in transition—from pro- 
duction to selection of salesmen, (3) the sales manpower development 
program of the General Electric Co., and (4) the use of aptitude testing 
as a management tool. A panel discussion follows the speeches. The 
material presented is quite general and contains little factual data which 
would be of assistance in guiding a testing program in industry. Some 
of the panel discussion, and especially the paper by W. H. Wulfeck on the 
scientific basis of psychological tests might be of interest to those in the 
field of testing and guidance. 


Robert M. Thomson 
University of Minnesota 


Lazarsfeld, Paul F., and Field, Harry. The people look at radio. Chapel 
Hill: The University of North Carolina Press, 1946. Pp. ix + 158. 
$2.50. 


The National Association of Broadcasters commissioned the National 
Opinion Research Center to conduct a public opinion study of attitudes 
toward radio. Then Columbia University’s Bureau of Applied Social 
Research was called in to interpret the results and prepare the report. 

The principal survey was based on a national sample of 2571 men and 
women with an extended sample of 672 respondents in the Mountain 
and Pacific time zones used for geographical breakdowns. In addition 
two supplementary surveys, involving 498 and 1091 radio listeners, were 
conducted. 

The first chapter presents an overall appraisal of radio as an insti- 
tution. The second chapter reports the results on attitudes toward radio 
advertising, and the third chapter covers program preferences with em- 
phasis on types of programs rather than specific radio shows. The fourth 
chapter gives an analysis of certain industry problems: the educational 
level of critics of radio, the educational value of radio, attitudes toward 
selling or donating radio time for certain purposes, fairness in handling 
controversial issues, and the role of government in the operation of radio 
stations. 

Detailed results are quoted only when they are essential. Sources 
beyond this particular study are included when necessary, and the reader 
is given the advantage of the writer’s background of experience. This 
approach results in a report that runs smoothly and does not get bogged 
down by unnecessary detail. The appendices report the complete re- 
sults: the characteristics of the sample, the questions asked and tabu- 
lations of the responses, the characteristics of the supplementary samples, 
and certain special topics such as an analysis by levels of severity of criti- 
cism and an analysis of program preferences. Thus all the detail is in- 
cluded but it does not interrupt the main line of presentation. 

One common error in many public opinion studies is interpreting the 
results as if they were absolutes when they are merely relatives. This 
report is an outstanding exception. The overall appraisal of radio, for 
example, is put on a relative basis by comparing radio with churches, 
newspapers, schools, and local government. Again, the conclusion that 
about one-third of radio listeners have a negative attitude toward radio 
advertising is accepted only after several different approaches produced 
about the same results. Skillful cross-tabulation and group analysis are 
used to give meaning to various attitude categories. The responses of 
people who reported they “feel like criticizing when they listen to the 
radio” become meaningful when it is shown that affirmative answers are 
positively related to the amount of listening. The difference between 
“not minding” and “putting up with” radio advertising is given practical 
meaning by showing the relation to the question on whether one would 
prefer radio without advertising. The only example of reporting results 
as absolutes is a minor one: quoting the average number of hours of listen- 
ing per day when there is some question of the validity of the results when 
used for anything except division into groups for relative comparisons. 

There is every indication that the report is written from a practical 
viewpoint. In one analysis the objections of lenient critics are compared 
with those of more severe critics to find what can be done to win over 
the group that can be influenced most easily. Surely this represents a 
practical approach. Throughout the report there is full recognition of 
the practical problem of taking into consideration factors other than 
public opinion in evaluating radio. 

Probably most important of all is the impression that this was a 
completely honest project from start to finish. There is no indication of 
pointing either the questions or the interpretation in a direction favorable 
to radio. Obviously, everyone involved wanted to know the true situa- 
tion regardless of whether it was favorable or unfavorable. Some people 
may object to the use of quota sampling, some people may object to an 
approach that is more like the single-question method than the uni- 
dimensional attitude scale technique; but the reviewer doubts that any- 
one would have any basis for questioning the honesty of the operation. 


Alfred C. Welch 
Knox Reeves Advertising, Inc., 
Minneapolis, Minnesota 


Davis, Fred B., Item-Analysis data, their computation, interpretation, and 
use in test construction. Harvard University, Cambridge, Mass., 
1946. Pp. vi + 42 and chart. $.75. 


The extensive literature and the number of computationally simple 
techniques of item analysis developed over the past fifteen years make 
questionable the need for a new method. If the need be granted, how- 
ever, this bulletin presents a sound and accurate summary of the under- 
lying logic and a method which should prove useful. The chart simpli- 
fies the conversion of data into the two basic indices: difficulty and 
discrimination. 

The possibility of using external as well as internal criteria is 
recognized; unfortunately, the importance of external criteria is under- 
emphasized. There is, moreover, a healthy appreciation of the often- 
overlooked fact that item analysis contributes merely one line of useful 
information about the items at hand; it cannot substitute for ingenuity 
and skill in item construction nor is it more than an aid to judgment in 
revising or selecting items. 

Difficulty is measured on a linear scale with correction for “chance 
success” and for failure to attempt items at the end of the test. The as- 
sumptions involved in both corrections are clearly stated, but even this 
statement is not likely to prevent their being overlooked in application. 
The assumptions involved in estimating difficulty from only the tails of 
the distribution (upper and lower 27%) are omitted. The percentage of 
correct answers among total attempts is converted to a linear scale, using 
the normal probability integral, ranging from 1 to 99. The reliability of 
these indices based upon 100 cases in each tail (370 test papers altogether) 
is reported as about .98. In the reviewer’s experience, values as high as 
this, even for comparable samples, are unusual. 
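
The conversion described above can be sketched roughly as follows. This is a minimal illustration and not the bulletin's own procedure: the chance correction is the conventional rights-minus-wrongs formula, and the scaling constant is simply chosen so that 1 per cent and 99 per cent correct fall at about 1 and 99 on the scale. The function names are hypothetical.

```python
from statistics import NormalDist


def corrected_proportion(right, wrong, n_choices):
    """Proportion correct among those attempting the item, corrected for
    chance success with the conventional formula R - W / (k - 1).
    Omitted or unreached items are excluded from the denominator."""
    attempts = right + wrong
    if attempts == 0:
        raise ValueError("no examinee attempted the item")
    corrected = right - wrong / (n_choices - 1)
    return max(corrected, 0.0) / attempts


def linear_difficulty(p):
    """Map a proportion onto a linear 1-99 scale via the normal
    probability integral. The scaling is an assumption: 50 per cent
    maps to 50 and 99 per cent maps to 99."""
    nd = NormalDist()
    p = min(max(p, 0.01), 0.99)          # keep the deviate finite
    scale = 49.0 / nd.inv_cdf(0.99)      # about 21.06 scale points per sigma
    return 50.0 + scale * nd.inv_cdf(p)


# Example: 80 right and 20 wrong on a five-choice item.
p = corrected_proportion(80, 20, 5)      # 0.75
index = linear_difficulty(p)
```

Because the index is linear in the normal deviate rather than in the percentage, equal steps on the scale do not correspond to equal steps in percentage correct.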

Difficulty and discrimination are nicely distinguished, and the effect 
of difficulty on certain widely-used discrimination indices is noted. 
The limitations of the critical ratio as an index of discrimination are 
particularly well stated. 

The discrimination index used is Fisher’s z function based on what are 
essentially tetrachoric r’s. These z-values are then transmuted to a scale 
from 0 to 100. The use of z rather than r in item analysis is a new re- 
finement; how necessary a refinement is not clear. The indices are based 
upon the responses of the 27% tails, following Kelley’s demonstration 
that, for certain specified and highly atypical conditions, this choice of 
groups is optimum. The practical advantage of the 27% groups over 
the simpler 25% has never been apparent to the reviewer. 
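
For readers unfamiliar with the mechanics, the selection of tail groups and the z transformation might be sketched as below. This is an illustration under stated assumptions: the upper-minus-lower difference in proportion correct is a simpler stand-in for the tetrachoric-based index, which requires bivariate normal tables, and the function names are hypothetical. The bulletin's transmutation of z to a 0 to 100 scale is not reproduced because its constants are not given here.

```python
import math


def tail_groups(scores, fraction=0.27):
    """Return (lower, upper): lists of examinee indices falling in the
    bottom and top `fraction` of the total-score distribution."""
    n_tail = max(1, round(len(scores) * fraction))
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[:n_tail], order[-n_tail:]


def upper_lower_difference(item, lower, upper):
    """Stand-in discrimination index: proportion correct in the upper
    group minus proportion correct in the lower group."""
    p_upper = sum(item[i] for i in upper) / len(upper)
    p_lower = sum(item[i] for i in lower) / len(lower)
    return p_upper - p_lower


def fisher_z(r):
    """Fisher's z transformation of a correlation coefficient."""
    return math.atanh(r)


# With 370 papers, each 27% tail holds 100 cases.
lower, upper = tail_groups(list(range(370)))
```

Changing `fraction` to 0.25 gives the simpler quartile groups mentioned above, at the cost of slightly fewer cases in each tail.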

The interpretation of these difficulty and discrimination indices is 
hindered by the use of scales which assign different meanings to the nu- 
merical values used in the more familiar indices of percentage and cor- 
relation coefficient. Linearity could have been obtained without this 
possibility of confusion. 

The dependence of both indices on many factors, and their relation 
to expert review of items and careful sampling of the field are nicely 
pointed out. Greater emphasis should be given to the fact that neither 
index measures an attribute inherent in the item, but rather relationships 
among the item, the testees, and (with internal criteria) the remainder of 
the items. Davis does point out that validity may be drastically modified 
by successive selection of items on internal criteria alone. 

Other writers in the field are not always appropriately recognized in 
the text. For example, although Richardson’s work on item difficulty 
and test validity is cited, there is no mention of T. G. Thurstone or of 
Wherry & Gaylord. 

The most serious omission is the complete lack of any reference to 
cross-validation or to the need for it. The treatment implies that items 
selected on the basis of one sample and applied to another similar sample 
will retain all of their virtues. Item analysis capitalizes on any peculi- 
arities in the sample; thus a recheck on at least one other sample is neces- 
sary to give confidence that the findings characterize the population 
rather than the sample alone. Moreover, a powerful use of item analysis, 
which employs two or more criteria and selects items correlating high 
with one and low with another as a means of constructing relatively 
uncorrelated measures, is not mentioned as a possibility. 
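
The recheck on a second sample can be illustrated with a small sketch. It assumes an external criterion score for each examinee and uses the ordinary product-moment correlation as the item index; the threshold, the toy data, and the function names are hypothetical and are not drawn from the bulletin.

```python
import math


def pearson(x, y):
    """Product-moment correlation; returns 0.0 if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def item_validities(responses, criterion):
    """responses: one row of 0/1 item scores per examinee."""
    n_items = len(responses[0])
    return [pearson([row[j] for row in responses], criterion)
            for j in range(n_items)]


def cross_validate(resp_a, crit_a, resp_b, crit_b, threshold=0.3):
    """Select items on sample A, then report each selected item's index
    in sample A alongside its index in the independent sample B."""
    val_a = item_validities(resp_a, crit_a)
    val_b = item_validities(resp_b, crit_b)
    return {j: (val_a[j], val_b[j])
            for j, r in enumerate(val_a) if r >= threshold}


# Toy data: item 1 looks useful in sample A but reverses in sample B.
resp_a = [[1, 1], [1, 0], [0, 1], [0, 0]]
resp_b = [[1, 0], [1, 1], [0, 0], [0, 1]]
crit = [4, 3, 2, 1]
result = cross_validate(resp_a, crit, resp_b, crit)
```

An item whose index holds up in the second sample, as item 0 does here, can be retained with more confidence than one selected on a single sample.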

Despite these omissions, this bulletin has a definite value in summa- 
rizing the uses and limitations of item analysis techniques and in pre- 
senting them as tools to be used with judgment rather than as a machine 
into which one can pour data, then turn a crank and extract a finished test. 


Charles I. Mosier 
Personnel Research Section, A.G.O., 
Washington, D.C. 


Westover, Frederick L. Controlled eye movements versus practice exercises 
in reading. New York: Teachers College, Columbia University Con- 
tributions to Education, 1946, No. 917. Pp. 99. $1.95. 


The reading clinician is confronted with the problem of how much use 
is to be made of mechanical devices in his remedial work. There are now 
several of these devices available for improving reading by controlling 
eye movements. Their use, which has become widespread, is based upon 
the assumption that the pacing of eye movements will improve reading 
proficiency by correcting faulty oculomotor habits. These techniques, 
unfortunately, emphasize peripheral factors as fundamental in causing 
reading disability. But the use of pacing techniques, with or without the 
aid of mechanical devices, has quite uniformly produced improvement 
in the rate of reading. Although a few studies have been concerned with 
the effectiveness of pacing techniques in comparison with other methods 
of improving reading, more evidence is needed. This report compares 
three methods of improving the reading speed and comprehension of 
college freshmen: (1) college work with no special exercises although 
members of this (the control) group as well as the other groups were moti- 
vated by informing them that their reading ability was poor, that they 
needed to do better to handle college work and that improvement was pos- 
sible; (2) college work with practice in special reading exercises; (3) college 
work with practice in reading the same special exercises under conditions 
of controlled eye movements. This was achieved by means of a special 
apparatus which forced pacing of the eye movements. 

The two experimental groups gained significantly more than the con- 
trol group, but there were no significant differences in the effectiveness 
of the two instructional methods (reading exercises vs. pacing). The 
finding that mechanical pacing of eye movements yields results no differ- 
ent from those achieved by the use of reading exercises is highly important. 
It gives additional experimental support to the view frequently expressed 
by the reviewer that just as satisfactory gains in reading may be achieved 
without the use of pacing techniques. When pacing techniques are em- 
ployed, the teacher or the clinical worker is too prone to overemphasize 
peripheral factors in reading and to neglect the more fundamental central 
factors of perception, comprehension and assimilation. 

This investigation was adequately designed and the data skillfully 
analyzed. The author is well aware of certain limitations of the investiga- 
tion, i.e., the training period was rather short; the “control group” was not 
a real control group since its members were motivated by special in- 
structions to improve their reading. This excellent study will be well 
received by those interested in the experimental study of reading, but will 
be ignored by those teachers and clinicians who love their mechanical 
gadgets. 


Miles A. Tinker 
University of Minnesota 


Cantor, Nathaniel. Dynamics of learning. Buffalo, N. Y.: Foster and 
Stewart, 1946. Pp. x + 282. $3.00. 


In this book the chairman of the Department of Anthropology and 
Sociology of the University of Buffalo describes his own experience in the 
teaching of courses in personality and culture, and in criminology, by a 
discussion method which departs radically from the usual college lecture. 
Most college teachers in the social sciences and humanities will accept 
the diagnosis of the learning situation as Professor Cantor describes it, 
but few of them will have had his courage to adapt their instructional 
methods to the implications of this diagnosis. 

The American school system is charged with being authoritarian, sup- 
porting the status quo, and rewarding individual rather than cooperative 
effort. If these charges be true, it is little wonder that our schools have 
been little successful in remolding student attitudes in ways required by 
democracy. To become democratic, one must live democratically. Un- 
less students are permitted to express themselves in the classroom and to 
learn there to respect differences of expression, Professor Cantor asks: 
“Why should they be expected as adults to believe in and make sacrifices 
for a democratic way of life?” 

As the title of the book implies, the justification for the teaching pro- 
cedure illustrated and advocated rests upon clinical psychology. Stu- 
dent statements stenographically reported from class discussions, and ex- 
cerpts from papers handed in, show how mechanisms such as resistance, 
ambivalence, projection, and identification are found in the interplay 
among students, instructor, and textbook author. The adopted method, 
which recognizes that the student must be given responsibility if he is to 
develop to an independent, critical, but tolerant position, has much in 
common with Rogers’ non-directive therapy. 

All the contemporary discussion in our colleges and universities about 
revised courses of study and the re-evaluation of educational objectives 
will be empty unless teachers come to know more about how students 
learn, and begin to appraise the development of students in ways not 
shown in the usual course examination. The testimony from students 
who have studied with Professor Cantor gives strong support to these 
contentions. 

While the book is not written for the professional psychologist, it 
should give pause to the psychologist who is also a teacher. Upon read- 
ing it he cannot help but ask himself, Am I really teaching in a way con- 
sonant with what I know about the psychology of personality, mental 
hygiene, and attitude formation? Or am I teaching in the conventional 
ways in which I was taught with no more than tradition to justify my 
practices? If he should choose to teach in newer ways, he would find 
encouragement and instruction from this account of the success which 
Professor Cantor has had in the use of a freer method. 


Ernest R. Hilgard 
Stanford University 


Tyler, Leona E. The psychology of human differences. D. Appleton- 
Century Co., Inc. New York, 1947, pp. XIII + 420. $3.75. 

This is a textbook designed for use in courses on individual differences, 
differential psychology, or human variability. It is intended to meet the 
needs both of the general student and of the undergraduate major in 
psychology. Such a book, according to the author, should give up-to- 
date information, present the facts so that they can be readily assimilated, 
and show students how to avoid wrong conclusions. The stated purpose 
is “to synthesize and reconcile opposing points of view rather than to 
perpetuate old arguments . . ., to sort out the findings which stand up 
under critical statistical analysis from those which are in error or am- 
biguous, and to separate actual results from interpretations.” 

Dr. Tyler has performed her task admirably. The book is a model 
of clear exposition and critical appraisal of reported data. That it is 
up-to-date is indicated by the fact that 56 per cent of the 260 references 
cited are dated later than 1936. Some may even feel that the author has 
given too little attention to the historical background of recent research 
in certain of the fields covered. The index contains only two references 
to E. L. Thorndike, three to Stern, and none to Meumann. There are, 
however, seven references to Galton, six to J. Cattell, six to Spearman, 
and four to Binet. 

Apart from the introductory and concluding chapters, the topics 
covered include the nature and extent of differences, methods and logic, 
sex differences, race and nationality differences, class differences, age 
differences, mental deficiency, genius, the effects of practice upon differ- 
ences, heredity and environment, measurement of aptitudes, and the 
search for basic traits. The space devoted to each of these topics is in 
most cases between 25 and 35 pages, including chapter references and 
(for some chapters) problems listed for practice. 

Although on the whole the book is fairly inclusive and the allotment 
of space judicious, there are a few omissions which this reviewer believes 
should be made good when a second edition is called for. For example, 
there is little discussion of differences revealed by the vast literature on 
personality tests and character tests. There is nothing on problem 
children, and delinquency is mentioned only in reference to sex differences 
in incidence. A chapter on differences in achievement scores among 
pupils in given school grades would have added considerably to the value 
of the book for prospective teachers and school administrators, as would 
also a chapter on differences in scholastic aptitudes at the high school 
and college levels. No mention is made of the enormous differences in 
scholastic achievement found by Learned and Wood. The chapter on 
age differences is confined entirely to adult subjects, without mention of 
age differences in childhood or the overlap of successive age groups in 
the earlier years. The chapter on sex differences is also confined almost 
entirely to adults. Physical differences are dealt with chiefly in their 
relation to mental traits while differences in rate of physical maturation 
are ignored. No reference is made to Gesell’s researches in develop- 
mental phenomena. 

The reviewer hopes he does not seem unduly critical in calling atten- 
tion to what he regards as gaps in a textbook that is so outstanding in 
its merits. The gaps can be filled in later. More important is the book’s 
superb quality, which is evidenced throughout in its organization, its 
exceptional readability, its critical handling of controversial issues, and 
its effectiveness in acquainting the reader with pitfalls in the interpreta- 
tion of data on human differences. Students will find it interesting 
and challenging. 


Lewis M. Terman 
Stanford, California 